As developers know, program source code is represented as lines of text.
The Abstract Syntax Tree (AST) is a representation of the source code as a hierarchical data graph, specifically a tree structure. With an AST, much of the difficult work of parsing the original source code has already been performed, and the syntax can be introspected programmatically.
Similar to an AST, the Abstract Semantic Graph (ASG) is a graph of the semantic representation of the source code. The ASG goes one step further than the AST by representing semantic information.
When Erlang or Elixir source files are compiled, each module is converted to an
Abstract Semantic Graph and saved to a file. This file is called a BEAM file,
and it has a .beam extension.
Elixir provides a way to extract either the AST or ASG from source code. This information is used by tools such as Formatter and Dialyzer for the benefit of developers.
We’ll walk through two techniques for extracting information from program code.
The Elixir AST
Code.string_to_quoted!/1 converts Elixir source code into an Elixir Abstract
Syntax Tree (AST).
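As a minimal sketch, assuming the source being parsed is the expression 2 + 3 (an illustrative input chosen here), the call and its result look roughly like this:

```elixir
# Parse a small expression into its quoted form (the Elixir AST).
# The input "2 + 3" is an illustrative assumption.
ast = Code.string_to_quoted!("2 + 3")

IO.inspect(ast)
# With default options this prints: {:+, [line: 1], [2, 3]}
```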
string_to_quoted!/1 (and its sibling string_to_quoted/1) knows that the
above bit of source code is an operation on two operands. It represents the
plus sign as an atom (:+), and it represents the two operands as a list.
The Elixir AST typically contains three-element tuples like the one above. The first element is an operation or data type. The second element is metadata about the operation (e.g., source code line number), and the third element is the arguments of the operation, or in the case of a data type, the data.
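The three elements can be pulled apart with ordinary pattern matching. The sketch below assumes the same illustrative input, 2 + 3, and the variable names are chosen here for readability:

```elixir
# Destructure the three-element tuple: operation, metadata, arguments.
{operation, metadata, arguments} = Code.string_to_quoted!("2 + 3")

IO.inspect(operation, label: "operation")
IO.inspect(Keyword.get(metadata, :line), label: "line")
IO.inspect(arguments, label: "arguments")

# Simple literals such as integers are represented as themselves,
# not as three-element tuples.
literal = Code.string_to_quoted!("42")
IO.inspect(literal, label: "literal")
```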
Let’s try an example on a function call:
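A minimal sketch, assuming the call in question is f() written with explicit parentheses. The name f is hypothetical and no such function needs to exist, because the parser only looks at syntax:

```elixir
# Parse a zero-argument call written with explicit parentheses.
ast = Code.string_to_quoted!("f()")

IO.inspect(ast)
# With default options this prints: {:f, [line: 1], []}
```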
The above represents a function call with the operation :f. In actuality, the AST
is not sure it’s a function call. It just knows that the expression
is “call-like” and that it takes zero arguments.
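For contrast, here is a sketch of the same hypothetical name written without parentheses:

```elixir
# Parse a bare identifier: it could be a variable or a call without parentheses.
ast = Code.string_to_quoted!("f")

IO.inspect(ast)
# With default options this prints: {:f, [line: 1], nil}
```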
Here, the AST gives nil to the arguments list, meaning arguments don’t apply.
The AST representation doesn’t actually know whether it’s a call or a variable.
In Elixir, parentheses are optional for a function call, so it could be either.
We’ll do one more:
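A sketch, assuming a small map literal such as %{a: 1} as the input (the key and value are illustrative):

```elixir
# Parse a map literal; the operation is the special atom :%{}
# and the arguments are the key-value pairs.
ast = Code.string_to_quoted!("%{a: 1}")

IO.inspect(ast)
# With default options this prints: {:%{}, [line: 1], [a: 1]}
```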
The above shows what the AST looks like for a map literal.
The BEAM File
When Elixir (or Erlang) compiles a module, it creates a .beam file that
stores the compiled module. If code is compiled using a build tool such as Mix,
the .beam file can be found in the project's build output directory. A .beam file
can be created more directly, using elixirc. This puts the .beam file in the
current directory. We will use elixirc for the purposes of this article.
We’ll start with a sample module:
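The module below is a hypothetical stand-in for illustration. It assumes a module named MyModule with three public functions, one of which (addition/0) evaluates 2 + 3 as discussed later; the other two function names are invented here.

```elixir
defmodule MyModule do
  # Evaluates the simple addition examined later in the article.
  def addition do
    2 + 3
  end

  # Hypothetical second function, included only so the module has several exports.
  def subtraction do
    5 - 2
  end

  # Hypothetical third function that takes one argument.
  def double(x) do
    x * 2
  end
end
```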
Save the above file to my_module.ex, then compile it by running elixirc my_module.ex.
Erlang provides the beam_lib library and its chunks/2 function for reading
.beam files.
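A minimal sketch of such a call follows. To stay runnable without a separate elixirc step, it assumes a throwaway module compiled in memory and written to a temporary .beam file; the temporary path and the one-function module are illustrative. It also assumes the module was compiled with debug information, which is the compiler default.

```elixir
# Assumption: a small module is compiled in memory and written to a
# temporary .beam file so the sketch is self-contained.
source = """
defmodule MyModule do
  def addition, do: 2 + 3
end
"""

[{module, binary}] = Code.compile_string(source)
path = Path.join(System.tmp_dir!(), "#{module}.beam")
File.write!(path, binary)

# beam_lib expects the file path as a charlist, not a string.
{:ok, {^module, [abstract_code: {:raw_abstract_v1, forms}]}} =
  :beam_lib.chunks(String.to_charlist(path), [:abstract_code])

# forms is a list of :attribute and :function tuples.
IO.inspect(forms, limit: :infinity)
```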
The first argument to
:beam_lib.chunks/2 is the
.beam file path. Note the
single quotes; it’s a charlist, not a string.
The second argument is a list of “chunk types” to extract from the .beam file.
The full list of available chunk types can be found in the Erlang Source Code.
The bulk of the return data is a list of tuples. Some of the tuples contain
:attribute as the first element, and the others have :function as
the first element.
The tuples having
:function represent the functions of the module. The third
element in the tuple is the function name.
You’ll notice a large function called
:__info__ that is automatically added to all Elixir modules.
The last three functions are the ones defined in the module’s source code.
Digging into the
:addition function representation, we can see the semantic
representation of the simple addition operation,
2 + 3:
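One way to isolate that function is sketched below. It assumes a one-function stand-in module compiled in memory, and it passes the compiled binary straight to :beam_lib.chunks/2, which accepts a binary as well as a file path. The exact shape of the printed body may vary between Elixir and Erlang versions, so no output is shown.

```elixir
# Assumption: a stand-in module compiled in memory instead of read from disk.
source = """
defmodule MyModule do
  def addition, do: 2 + 3
end
"""

[{_module, binary}] = Code.compile_string(source)

{:ok, {_, [abstract_code: {:raw_abstract_v1, forms}]}} =
  :beam_lib.chunks(binary, [:abstract_code])

# Find the tuple for addition/0 and unpack its single clause.
addition = Enum.find(forms, &match?({:function, _, :addition, 0, _}, &1))
{:function, _anno, :addition, 0, [{:clause, _, args, _guards, body}]} = addition

IO.inspect(args, label: "arguments")
IO.inspect(body, label: "body")
```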
If the goal is to get the list of functions defined by the module, a small filter and map is all it takes:
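A sketch of that filter and map, assuming a two-function stand-in module compiled in memory (the module and function names are illustrative):

```elixir
# Assumption: a stand-in module compiled in memory instead of read from disk.
source = """
defmodule MyModule do
  def addition, do: 2 + 3
  def double(x), do: x * 2
end
"""

[{_module, binary}] = Code.compile_string(source)

{:ok, {_, [abstract_code: {:raw_abstract_v1, forms}]}} =
  :beam_lib.chunks(binary, [:abstract_code])

# Keep only the :function tuples, then extract each name and arity.
functions =
  forms
  |> Enum.filter(&match?({:function, _, _, _, _}, &1))
  |> Enum.map(fn {:function, _, name, arity, _} -> {name, arity} end)

IO.inspect(functions)
```

The resulting list also includes functions that the compiler generates, so a further filter may be needed if only user-defined functions are wanted.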
We’ve learned how to introspect Elixir source code by extracting the AST and ASG. It is my hope that this information will help you build the next great developer tool for Elixir.