Informal Domain Specific Languages with Perl 6

class: center, middle

# Informal Domain Specific Languages <br> with Perl 6
## Brian Duggan

<span class='github'>
  <small><span class='github'><img src='./github.svg'>bduggan</span>
      <br>bduggan@matatu.org</small>
</span>

---
class: center, middle
background-image: url(gears-background.png)

<h2>We craft software for companies that care about the details.</h2>

---
layout: true
<div class='header'>
<header>Informal Domain Specific Languages</header>
</div>
<div class="footer">
  .active[Intro]
.notyet[▶]
  .notyet[Internal]
.notyet[▶]
  .notyet[External]
.notyet[▶]
  .notyet[Variant]
</div>

---
class:bullets
## Introduction
* not talking about HTML, SQL, CSS
* but about languages that generate things like HTML, SQL, CSS
* informal: "unofficial", "casual", expressive, colloquial
* lack rigorous specifications
* practical

---
class: bullets
## Examples
* HTML Generation (templates, wikis)
   * Template Toolkit, Jinja2, Slim
   * Markdown, Wikitext (Wikipedia)
* SQL Generation
   * DBIx::Class, Rose::DB, SQLAlchemy, Arel
* Web Microframeworks
   * Mojolicious, Dancer, Plack, Sinatra
---
class: bullets
## Categories

* *Internal*: subset of a general language <sup>1</sup><br>
  web frameworks, ORMs
* *External*: parsed <sup>1</sup><br>
  templating, wikis
* *Variant*: modification<br>
   "Slangs"
.footnote[
<small><sup>1</sup>_DSLs_, Fowler, 2010</small>
]
???
We have three broad classifications, but as we'll see in Perl 6, the
distinction can be murky.  Internal may look like variants.  Variants
are handled in a similar manner to external DSLs.
---
class: bullets
## Techniques
* *Internal*: subset of a general language .pull-right[Custom operators]
<br>
  web frameworks, ORMs
* *External*: parsed .pull-right[Grammars]
<br>
  templating, wikis
* *Variant*: modification .pull-right[Slangs]
<br>
   "Slangs"
???
For the rest of this talk, we'll look at a few general techniques from
each category and apply them to a few specific examples.
---
layout: true
<div class="footer">
.notyet[Intro]
.notyet[▶]
  .active[Internal]
.notyet[▶]
  .notyet[External]
.notyet[▶]
  .notyet[Variant]
</div>

---
class: center, middle
# Internal IDSLs
# Custom Operators
---
layout: true
<div class="header">
<header>Perl 6 Custom Operators</header>
</div>
<div class="footer">
.notyet[Intro]
.notyet[▶]
  .active[Internal]
.notyet[▶]
  .notyet[External]
.notyet[▶]
  .notyet[Variant]
</div>

---
class:bullets
## operator types
* infix: A + B
* prefix: -A
* postfix: A++
* circumfix: [A]
* postcircumfix: A[B]
--

* Unary operators take 1 argument
* Binary operators take 2 arguments
--

* Infix operators also have a noun form, `&[+]`

```code
say &[+](1,2)
```

--
* Subs that take two args can be used as infix binary operators

```code
sub plus-twice($a,$b) { $a + 2 * $b }
say 1 [&plus-twice] 2
```

---
## Custom operators
Custom operators are declared like this:
```code
sub infix:<plus>($x,$y) {
    $x + $y
}
say 1 plus 2;
```
--
```output
3
```
--

```code
sub prefix:<@@>($x) { $x * 2 }
sub postfix:<+++>($y is rw) { $y+=3 }

my $z = @@10;
$z+++;
say $z;
```

--
```output
23
```

---
### Custom operators

* All of Unicode is fair game

<img src='dot.svg'>
```code
# dot product
sub infix:<∙>(@a,@b) {
  return [+] @a Z* @b
}

say (1,2) ∙ (3,4)

```

```output
11
```
---
### Custom operators

* All of Unicode is fair game

```code
sub circumfix:<⌊ ⌋>($x) {
  $x.floor
}
say ⌊2.4⌋
```

```output
2
```

---

## Combining operators
```code
sub infix:<plus>($x,$y) {
    $x + $y
}
sub infix:<times>($x,$y) {
    $x * $y
}
say 1 plus 2 times 3;
```
--
```output
9
```

???
But multiplication has higher precedence than addition.  We want this to be 7.
---
## Precedence
`is tighter` controls precedence

```code
sub infix:<plus>($x,$y) {
    $x + $y
}

sub infix:<times>($x,$y) is tighter(&infix:<plus>) {
    $x * $y
}

say 1 plus 2 times 3;
```

--
```output
7
```
Also `is looser`, `is equiv`
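---
## Precedence: `is equiv`, `is looser`

A minimal sketch of the other two traits. The operator names `plus`, `times` and `minus-last` are made up for illustration. `is equiv` borrows the precedence of an existing operator, and `is looser` creates a level just below one.

```code
# Borrow the precedence of the built-in + and *
sub infix:<plus>($x, $y)  is equiv(&infix:<+>) { $x + $y }
sub infix:<times>($x, $y) is equiv(&infix:<*>) { $x * $y }

# Binds more loosely than +, so the addition happens first
sub infix:<minus-last>($x, $y) is looser(&infix:<+>) { $x - $y }

say 1 plus 2 times 3;        # 1 + (2 * 3)
say 10 minus-last 2 + 3;     # 10 - (2 + 3)
```
```output
7
5
```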

---
## Chaining operators
```code
sub infix:<to-the-power>($x,$y) {
    $x ** $y
}

say 2 to-the-power 3 to-the-power 2
```
--
```output
64
```
--
<font size='+10'>
  2<sup>3<sup>2</sup></sup>
</font>
treated as
<font size='+10'>
  (2<sup>3</sup>)<sup>2</sup>
</font>
but should be
<font size='+10'>
  2<sup>(3<sup>2</sup>)</sup>
</font>
---

## associativity
We can fix this.  `is assoc`!
```code
sub infix:<to-the-power>($x,$y) is assoc<right> {
    $x ** $y
}

say 2 to-the-power 3 to-the-power 2
```
--
```output
512
```
--
Associativity types:<br>
`right`
<br>
`left`
<br>
`non`
<br>
`chain` (`1 < 2 < 3`)
<br>
`list`  (`1,2 X 3,4 X 5,6`)
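---
## associativity: left vs. right

A small sketch comparing `left` and `right` on an operation where grouping changes the result. The operator names `minus-l` and `minus-r` are invented for this example.

```code
sub infix:<minus-l>($x, $y) is assoc<left>  { $x - $y }
sub infix:<minus-r>($x, $y) is assoc<right> { $x - $y }

say 10 minus-l 3 minus-l 2;   # (10 - 3) - 2
say 10 minus-r 3 minus-r 2;   # 10 - (3 - 2)
```
```output
5
9
```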

---
## argument types

define subtraction between strings

```code
sub infix:<->($x,$y) {
  $x.subst($y, "", :g);
}
say "house" - "u";
```

--
```output
hose
```
--
But
```code
say 32 - 2
```
--
```output
3
```
--
Argh!
---
## multiple dispatch

We can fix this, too.

```code
multi infix:<->(Str $x, Str $y) {
  $x.subst($y, "", :g);
}
say "house" - "u";
say 32 - 3;
```
--
```output
hose
29
```
---

## multiple dispatch

Also works for strings, ints, constants, or any class.

```code
multi sub infix:<->(Str $x, Str $y) {
  $x.subst($y, "", :g);
}
multi sub infix:<->(Str $x, Int $y) {
    $x.substr(0,$x.chars-$y)
}
multi sub infix:<->("escalator","electricity") {
    "stairs"
}
say "catamaran" - "a";
say "catamaran" - 6;
say "escalator" - "electricity";
say 10 - 5;
```
--

```output
ctmrn
cat
stairs
5
```
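---
## multiple dispatch: your own class

A sketch of the "any class" case. `Money` and its `cents` attribute are hypothetical names, not part of any library. The new candidate only applies when both sides are `Money`, so ordinary subtraction is untouched.

```code
class Money {
    has Int $.cents;
}

# Extra candidate for the built-in minus, chosen only for Money - Money
multi sub infix:<->(Money $a, Money $b) {
    Money.new(cents => $a.cents - $b.cents)
}

my $change = Money.new(cents => 500) - Money.new(cents => 125);
say $change.cents;
say 10 - 5;          # numbers still use the built-in candidates
```
```output
375
5
```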

---

## example
emulate Python's `%` string-formatting operator

```code
multi sub infix:<%>(Str $f, Numeric $n) {
    return sprintf($f,$n)
}
multi sub infix:<%>(Str $f,List $l) {
    return sprintf($f,$l.flat)
}

say 'This is %d.' % 40;
say 'Pi is about %0.2f and e is about %0.2f' % ( π, e );
```
???
Custom operators are an "internal" language, but may look like a variant
--

```output
This is 40.
Pi is about 3.14 and e is about 2.72
```
---
<h2>example: generating SQL</h2>

* DBIx::Class, Rose::DB::Object, SQL::Abstract
* SQLAlchemy, Squeel
* Typical techniques:
  * method chaining
  * operator overloading
  * data structure abuse

Perl 6 operators add new techniques:

.normal[
```SQL
SELECT id
FROM user INNER JOIN address ON address.user=user.id
WHERE name = 'ed' AND fullname = 'Ed Jones'
```
```code
(User + Address)[ name == 'ed' and fullname == 'Ed Jones' ]
```
]
* native operators `and`, `==` can be defined for columns
* postcircumfix [ ] can be used to filter

---
<h2>example: generating SQL</h2>

.normal[
```code
(User + Address)[ name == 'ed' and fullname == 'Ed Jones' ]
```
]

```code
class Table { ... }

multi sub infix:<+>(Table $a, Table $b) {
  ...
}
```

---
<h2>example: generating SQL</h2>

.normal[
```code
(User + Address)[ name == 'ed' and fullname == 'Ed Jones' ]
```
]

```code
class Filter { ... }
class Column { ... }

multi sub infix:<==>(Column $a, $b) {
  ...
  return Filter.new( ... )
}

multi sub infix:<and>(Filter $a, Filter $b) {
  ...
  return Filter.new( ... )
}

```
???
Though as far as operators and SQL go, there is already
a precedent.  The mathematical foundation of SQL is
relational algebra.
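---
<h2>example: generating SQL</h2>

One way the stubs could be filled in, as a rough sketch. Everything here is an assumption made for illustration: the attribute names, the foreign-key convention (`address.user` points at `user.id`), a new `AND` operator instead of overloading `and`, and a plain `filter` method instead of postcircumfix `[ ]`.

```code
class Filter { has Str $.sql }
class Column { has Str $.col }
class Table  {
    has Str $.from;                          # text of the FROM clause
    method filter(Filter $f) {               # stands in for [ ... ]
        "SELECT id FROM {$!from} WHERE {$f.sql}"
    }
}

# Table + Table: inner join, assuming the key is named after the first table
multi sub infix:<+>(Table $a, Table $b) {
    Table.new(from => "{$a.from} INNER JOIN {$b.from} "
                    ~ "ON {$b.from}.{$a.from}={$a.from}.id")
}
# Column == Str: build a filter instead of comparing
multi sub infix:<==>(Column $c, Str $v) {
    Filter.new(sql => "{$c.col} = '{$v}'")
}
# New word operator that combines two filters
sub infix:<AND>(Filter $a, Filter $b) {
    Filter.new(sql => "{$a.sql} AND {$b.sql}")
}

my \User     = Table.new(from => 'user');
my \Address  = Table.new(from => 'address');
my \name     = Column.new(col => 'name');
my \fullname = Column.new(col => 'fullname');

my $sql = (User + Address).filter(
    (name == 'ed') AND (fullname == 'Ed Jones')
);
say $sql;
```
```output
SELECT id FROM user INNER JOIN address ON address.user=user.id WHERE name = 'ed' AND fullname = 'Ed Jones'
```

The parentheses around each comparison are needed because a new infix defaults to the precedence of `+`; a precedence trait could remove them.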
---

## example: generating SQL
### Relational algebra

* Created by E. F. Codd in the 1970s
* Mathematical foundation of SQL
* Defines operators such as:
  * Projection (π)
  * Selection (σ)
  * Rename (ρ)
  * Natural join (⋈)
  * Semijoin (⋉)(⋊)
  * Antijoin (▷)
  * Division (÷)
  * Left outer join (⟕)
  * Right outer join (⟖)

---
## example: an ORM
**Relational algebra**
```code
sub infix:<⋈>($a, $b) {
  "$a NATURAL JOIN $b"
}
say "users" ⋈ "addresses"
```
```output
users NATURAL JOIN addresses
```
```code
sub prefix:<∏>(@x) {
  "SELECT " ~ @x.join(', ')
}
say ∏<name age>;
```
```output
SELECT name, age
```

Conclusion: Custom operators are useful building blocks for internal informal DSLs.
---
layout: true
<div class="footer">
.notyet[Intro]
.notyet[▶]
  .notyet[Internal]
.notyet[▶]
  .active[External]
.notyet[▶]
  .notyet[Variant]
</div>
---
class: center, middle
# External IDSLs
## Grammars
???
half way: 20 minutes
---
layout: true
<div class="header">
<header>Grammars</header>
</div>
<div class="footer">
.notyet[Intro]
.notyet[▶]
  .notyet[Internal]
.notyet[▶]
  .active[External]
.notyet[▶]
  .notyet[Variant]
</div>
---
class: bullets
## HTML Generation
* Templating
   * Template Toolkit, Jinja2, Slim
   * about 100 listed on Wikipedia
* Wikis
   * Markdown, MediaWiki
   * 15 "lightweight markup languages" listed on Wikipedia

Let's look at how to use grammars to parse one of them.

---

## Slim<br>http://slim-lang.com

Looks like this:
```slim
html
  head
    title Slim Examples
  body
    h1 Markup examples
```
generates
```html
<html>
  <head>
    <title>Slim Examples</title>
  </head>
  <body>
    <h1>Markup examples</h1>
  </body>
</html>
```

---
class: bullets
### Parsing Slim
1. Parse
2. Build a DOM
---

### Parsing Slim
* A grammar is a collection of regexes (like a class + methods)
* A token is a regex with no backtracking
* A rule is a token with significant whitespace

```code
grammar slim {
    rule TOP { <line>+ %% <eol>}
    token line   { <indentation> <tag> [ ' ' <text> ]?  }
    token indentation { <indent>* }
    token indent { '  ' }
    token tag    { \w+ }
    token text   { \V+ }
    token eol    { \n+ }
}
say slim.parse(q:to/DONE/);
html
  head
    title Slim Examples
  body
    h1 Markup Examples
DONE
```
* `X+ %% Y` means `X [ Y X ]* Y?` (a trailing `Y` is allowed)
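---
### Parsing Slim: regex, token, rule

A short sketch of the three declarators and of the separator quantifiers. The names `with-backtracking`, `no-backtracking`, `tight` and `loose` are invented for this slide.

```code
# regex backtracks: \w+ gives the final 'x' back
my regex with-backtracking { \w+ 'x' }
# token does not: \w+ keeps the 'x', so the literal cannot match
my token no-backtracking   { \w+ 'x' }
say so 'abx' ~~ / ^ <with-backtracking> $ /;   # True
say so 'abx' ~~ / ^ <no-backtracking> $ /;     # False

# whitespace in a token is ignored; in a rule it must match whitespace
my token tight { 'a' 'b' }
my rule  loose { 'a' 'b' }
say so 'ab'  ~~ / ^ <tight> $ /;               # True
say so 'a b' ~~ / ^ <tight> $ /;               # False
say so 'a b' ~~ / ^ <loose> $ /;               # True

# % forbids a trailing separator, %% allows one
say so 'a,b,c,' ~~ / ^ \w+ %  ',' $ /;         # False
say so 'a,b,c,' ~~ / ^ \w+ %% ',' $ /;         # True
```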
---

<h3>Parsing Slim</h3>
.right[

```code
html
  head
    title Slim Examples
  body
    h1 Markup examples

```
```code
grammar slim {
    rule TOP { <line>+ %% <eol> }
    token line   {
      <indentation>
      <tag> [ ' ' <text> ]?
    }
    token indentation { <indent>* }
    token indent { '  ' }
    token tag    { \w+ }
    token text   { \V+ }
    token eol    { \n+ }
}
```
]
.left[
```output
line => ｢html｣
        tag => ｢html｣
        indentation => ｢｣
line => ｢  head｣
        tag => ｢head｣
        indentation => ｢  ｣
         indent => ｢  ｣
line => ｢    title Slim Examples｣
        tag => ｢title｣
        text => ｢Slim Examples｣
        indentation => ｢    ｣
         indent => ｢  ｣
         indent => ｢  ｣
...
```
A grammar generates a match object.
]
---

## Generating a DOM
* A grammar can be associated with an object, to perform "actions".
* Methods on the object have the same names as the regexes in the grammar.
* When a regex matches, the method with the same name is called with the match object.
--

* Let's make a class called "DOM" and make an instance.

.clearfix[
.right[
```code
grammar slim {
  token tag { ... }
  token text { ... }
  token indentation { ... }
  ...
}
```
]
.left[
```code
class DOM {
  method tag($/) { ... }
  method text($/) { ... }
  method indentation($/) { ... }
}
my $dom = DOM.new;
slim.parse($text, actions => $dom);
```
]
]
<br>
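---

## Generating a DOM

Before the real thing, a tiny runnable sketch of how actions fire. The `KeyValue` grammar and the `Collect` class are hypothetical and much simpler than Slim. Each action method receives the match object as `$/`.

```code
grammar KeyValue {
    token TOP   { <key> '=' <value> }
    token key   { \w+ }
    token value { \w+ }
}

class Collect {
    has @.seen;
    # Called each time the regex of the same name matches
    method key($/)   { @!seen.push: 'key:'   ~ $/ }
    method value($/) { @!seen.push: 'value:' ~ $/ }
}

my $actions = Collect.new;
KeyValue.parse('color=red', actions => $actions);
say $actions.seen;
```
```output
[key:color value:red]
```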
---

## Generating a DOM

#### Algorithm:
0. Make a **Node** class which has a parent node + child nodes.<br>
   The nodes will be the DOM tree.  Also we will need a stack.
1. When you see a **tag**, **push** a new node onto the stack.
2. When you see **text**, set the text of the top node.
3. When you see **indentation**, **pop** until the stack size is the
   level of indentation.
4. Connect parent + child nodes when moving from the stack to the tree.

Tag: push.<br>
Indentation: maybe pop.<br>
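---

## Generating a DOM

The algorithm can be sketched roughly as below. This is one possible reading, not the definitive implementation: `Node`, the `root` sentinel and the `html` method are illustrative names, `TOP` is a `token` here to keep whitespace handling explicit, and children are linked to their parent at push time rather than at pop time.

```code
grammar slim {
    token TOP         { <line>+ %% <eol> }
    token line        { <indentation> <tag> [ ' ' <text> ]? }
    token indentation { <indent>* }
    token indent      { '  ' }
    token tag         { \w+ }
    token text        { \V+ }
    token eol         { \n+ }
}

class Node {
    has Str  $.tag;
    has Str  $.text is rw = '';
    has Node @.children;

    # Render this node and its subtree as indented HTML
    method html(Int $depth = 0) {
        my $pad = '  ' x $depth;
        return $pad ~ "<{$!tag}>{$!text}</{$!tag}>\n" unless @!children;
        $pad ~ "<{$!tag}>\n"
             ~ @!children.map(*.html($depth + 1)).join
             ~ $pad ~ "</{$!tag}>\n"
    }
}

class DOM {
    has Node $.root = Node.new(tag => 'root');   # sentinel above <html>
    has Node @!stack;

    # Indentation: pop until the stack size equals the indent level
    method indentation($/) {
        @!stack.pop while @!stack.elems > $<indent>.elems;
    }
    # Tag: link a new node under the current top, then push it
    method tag($/) {
        my $node   = Node.new(tag => ~$/);
        my $parent = @!stack ?? @!stack[*-1] !! $!root;
        $parent.children.push($node);
        @!stack.push($node);
    }
    # Text: belongs to the node on top of the stack
    method text($/) {
        @!stack[*-1].text = ~$/;
    }
}

my $text = q:to/DONE/;
html
  head
    title Slim Examples
  body
    h1 Markup examples
DONE

my $dom = DOM.new;
slim.parse($text, actions => $dom) or die 'parse failed';
print $dom.root.children[0].html;
```
```output
<html>
  <head>
    <title>Slim Examples</title>
  </head>
  <body>
    <h1>Markup examples</h1>
  </body>
</html>
```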

---

.left[
# stack
<img src='img/stack-0-0.png'>

# tree
<img src='img/tree-0-0.png'>

]
.right[
```code
*0 html
1   head
2     title Slim Examples
1   body
2     h1 Markup examples
0
```

]
.bottom[
indent 0, stack 0: push
]
---

.left[
# stack
<img src='img/stack-0-1.png'>

# tree
<img src='img/tree-0-1.png'>

]
.right[
```code
*0 html
1   head
2     title Slim Examples
1   body
2     h1 Markup examples
0
```

]
.bottom[
indent 0, stack 0: push
]
---

.left[
# stack
<img src='img/stack-1-0.png'>

# tree
<img src='img/tree-1-0.png'>

]
.right[
```code
0 html
*1   head
2     title Slim Examples
1   body
2     h1 Markup examples
0
```

]
.bottom[
indent 1, stack 1: push
]
---

.left[
# stack
<img src='img/stack-1-1.png'>

# tree
<img src='img/tree-1-1.png'>

]
.right[
```code
0 html
*1   head
2     title Slim Examples
1   body
2     h1 Markup examples
0
```

]
.bottom[
indent 1, stack 1: push
]
---

.left[
# stack
<img src='img/stack-2-0.png'>

# tree
<img src='img/tree-2-0.png'>

]
.right[
```code
0 html
1   head
*2     title Slim Examples
1   body
2     h1 Markup examples
0
```

]
.bottom[
indent 2, stack 2: push
]
---

.left[
# stack
<img src='img/stack-2-1.png'>

# tree
<img src='img/tree-2-1.png'>

]
.right[
```code
0 html
1   head
*2     title Slim Examples
1   body
2     h1 Markup examples
0
```

]
.bottom[
indent 2, stack 2: push
]
---

.left[
# stack
<img src='img/stack-3-0.png'>

# tree
<img src='img/tree-3-0.png'>