
LIRQ — Language Integrated Reflection Queries

Language Integrated Query (LINQ) is an extension to Microsoft .NET languages that provides query expressions that can be used to extract and process data from collections (for example, arrays, lists, dictionaries).

Concrete syntax of query expressions resembles SQL statements:

IEnumerable<Person> query;
query = from p in Persons where p.age > 90 select p;

In this post, I will introduce query expressions for program introspection in object-oriented languages. For example, query

{ field, in class <T>, <T> extends MyClass, of type int }

yields all integer fields declared in subclasses of MyClass.

Reflections on introspection… (pun intended)


What is LIRQ?

Queries can yield the following entities:

  • identifiers,

  • local variables,

  • function parameters,

  • arrays,

  • collections,

  • fields of classes,

  • classes,

  • instances of classes,

  • interfaces,

  • enumerations, and

  • functions (methods).

A query consists of one or more conditions written within curly braces and combined with Boolean operators:

  • comma , for conjunction

  • vertical bar | for disjunction

  • exclamation mark ! for negation.

In this post, I will assume that queries are embedded into Java/C#, though the concept itself does not depend on a particular language.

Value, name and type queries

For an identifier x:

  • query { &x } yields its value,

  • query { @x } yields the string “x”, and

  • query { #x } yields the type of x, which can be used in declarations:


int x;
{ #x } y; // int y;
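
As a rough illustration of what these three queries correspond to in plain Java, the sketch below uses java.lang.reflect on a field rather than a local variable, because Java reflection cannot see locals. The class ValueNameType and its helper names are hypothetical and are not part of LIRQ.

```java
import java.lang.reflect.Field;

// Hypothetical sketch: emulating { &x }, { @x } and { #x } for a field named x.
final class ValueNameType {
    static class Holder {
        int x = 42;
    }

    // Looks up a declared field by name and makes it readable.
    private static Field field(Object target, String name) {
        try {
            Field f = target.getClass().getDeclaredField(name);
            f.setAccessible(true);
            return f;
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("no such field: " + name, e);
        }
    }

    // { &x }: the current value of the field (boxed for primitives).
    static Object valueOf(Object target, String name) {
        try {
            return field(target, name).get(target);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    // { @x }: the name of the field as a string.
    static String nameOf(Object target, String name) {
        return field(target, name).getName();
    }

    // { #x }: the declared type of the field.
    static Class<?> typeOf(Object target, String name) {
        return field(target, name).getType();
    }
}
```

Unlike { #x }, the Class object returned by typeOf is only available at run time, so it cannot be used directly in a declaration.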

Primitive queries

For each entity mentioned above, there is a corresponding query ({var}, {class}, {field}, and so on) that yields all such entities. For example, query {var} will yield all variables. To yield a non-empty result, a query should contain at least one primitive condition.

Regular expressions for names

Query {'v*'} yields all identifiers with names starting with the character “v”; here * is used as a wildcard that stands for any sequence of characters. Queries can be used in qualified names, too:

person.{'a?e'}
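
One possible way to emulate such name queries at run time is to translate the pattern into a java.util.regex.Pattern and filter field names with it. The sketch below assumes that * matches any run of characters and ? matches exactly one character; the post does not spell this out, and the class NameQuery and the sample Person fields are hypothetical.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: selecting fields of a class by a wildcard name pattern.
final class NameQuery {
    static class Person {
        int age;
        int ace;
        String name;
    }

    // Translates a wildcard pattern into a regular expression
    // (assumption: '*' is any run of characters, '?' is exactly one character).
    static Pattern compile(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // Names of all declared fields of the given class that match the pattern.
    static List<String> fieldNames(Class<?> type, String wildcard) {
        Pattern pattern = compile(wildcard);
        List<String> names = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (!f.isSynthetic() && pattern.matcher(f.getName()).matches()) {
                names.add(f.getName());
            }
        }
        return names;
    }
}
```

With these sample fields, the pattern 'a?e' selects age and ace but not name. The order of the result is not guaranteed, because getDeclaredFields does not promise any particular order.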

Type constraints

Query {var, of type int} yields all local integer variables. It can be used in an assignment statement:

{var, of type int} = 0;

Constraints

Query


{ var, of type int, (that >= 0, that <= 10) }

yields all local integer variables whose value is in the range 0..10. Both bounds must hold, so the two comparisons are joined by a comma (conjunction); joined by | (disjunction), the condition would be true for every integer.

The keyword that refers to a yielded result of a query. Negation can also be used within that expressions. Queries &that, @that and #that are considered primitive queries. For example, {var, @that} yields a collection of names of all variables.

Query variables

Query

{ var <T>, of type int, ( <T> >= 0, <T> <= 10 ) }

is equivalent to query

{ var, of type int, (that >= 0, that <= 10) }

given above, but uses a query variable T that refers to the yielded result. Variable names are enclosed in angle brackets (remark: this syntax has nothing to do with generics).

Query variables can also be used for types and essentially all other entities, for example:

{var <X>, of type <Y>, <Y> is subtype of int}

Functions

Query {function <F>() returns <R>} yields all parameterless functions visible in the current scope. Parameters can be specified using a syntax that resembles regular expressions:

  • ? denotes any parameter,

  • * denotes 0 or more parameters,

  • + denotes 1 or more parameters,

  • int denotes an integer parameter, and so on.

Query {function <F>(?, int, *) returns string} yields all functions that return a string and whose second parameter is of type int.
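
To make the parameter patterns more concrete, here is a small sketch of how such a signature query could be matched against java.lang.reflect.Method objects. The enum Wildcard, the class SignatureQuery and the Sample methods are hypothetical names chosen for illustration. The matching rule is my reading of the list above: ? is exactly one parameter, * is zero or more, + is one or more.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: matching method signatures against a parameter pattern.
final class SignatureQuery {
    // Wildcard tokens; any Class<?> in a pattern matches exactly that parameter type.
    enum Wildcard { ANY, STAR, PLUS }

    static class Sample {
        String label(long id, int width) { return ""; }
        String pad(String s, int width, char fill, boolean left) { return s; }
        String name() { return ""; }
        int size(String s, int base) { return 0; }
    }

    // True if params[i..] matches pattern[j..].
    static boolean matches(Class<?>[] params, int i, Object[] pattern, int j) {
        if (j == pattern.length) {
            return i == params.length;
        }
        Object token = pattern[j];
        if (token == Wildcard.STAR || token == Wildcard.PLUS) {
            // STAR may consume zero parameters, PLUS at least one.
            int first = (token == Wildcard.STAR) ? i : i + 1;
            for (int k = first; k <= params.length; k++) {
                if (matches(params, k, pattern, j + 1)) {
                    return true;
                }
            }
            return false;
        }
        if (i == params.length) {
            return false;
        }
        boolean ok = token == Wildcard.ANY || token == params[i];
        return ok && matches(params, i + 1, pattern, j + 1);
    }

    // Names of the declared methods of owner with the given return type and parameter pattern.
    static List<String> find(Class<?> owner, Class<?> returns, Object... pattern) {
        List<String> names = new ArrayList<>();
        for (Method m : owner.getDeclaredMethods()) {
            if (!m.isSynthetic()
                    && m.getReturnType() == returns
                    && matches(m.getParameterTypes(), 0, pattern, 0)) {
                names.add(m.getName());
            }
        }
        return names;
    }
}
```

For the Sample class, find(Sample.class, String.class, Wildcard.ANY, int.class, Wildcard.STAR) plays the role of {function <F>(?, int, *) returns string} and selects label and pad, in no guaranteed order.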

Qualifiers

Query {class <T>, <T> extends MyClass} yields all classes that extend MyClass. The ... extends ... part of this query is called a qualifier. Other qualifiers include:

  • is abstract

  • is static

  • ... implements ...

  • ... inherits ...

  • is subtype of ...

  • has ... (used to specify that a class has a certain field or method),

  • and so on.
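
Several of these qualifiers map naturally onto predicates over java.lang.Class. The sketch below is only an approximation under a few assumptions: Java has no API for enumerating all classes, so the candidates are passed in explicitly, and the names Qualifiers, extendsClass, isAbstract, hasField and select are hypothetical.

```java
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: qualifiers as composable predicates over classes.
final class Qualifiers {
    static class MyClass {}
    static abstract class Shape extends MyClass {}
    static class Circle extends Shape { int radius; }
    static class Other {}

    // "<T> extends base": proper subclasses only, a class does not extend itself.
    static Predicate<Class<?>> extendsClass(Class<?> base) {
        return c -> base.isAssignableFrom(c) && c != base;
    }

    // "is abstract"
    static Predicate<Class<?>> isAbstract() {
        return c -> Modifier.isAbstract(c.getModifiers());
    }

    // "has ...": the class itself declares a field with the given name.
    static Predicate<Class<?>> hasField(String name) {
        return c -> Arrays.stream(c.getDeclaredFields())
                .anyMatch(f -> f.getName().equals(name));
    }

    // Keeps the candidates that satisfy the condition, in their original order.
    static List<Class<?>> select(List<Class<?>> candidates, Predicate<Class<?>> condition) {
        List<Class<?>> result = new ArrayList<>();
        for (Class<?> c : candidates) {
            if (condition.test(c)) {
                result.add(c);
            }
        }
        return result;
    }
}
```

The comma, | and ! operators of a query then correspond to Predicate.and, Predicate.or and Predicate.negate. For example, extendsClass(MyClass.class).and(isAbstract().negate()) selects only Circle from the four sample classes.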


Declared entities

The qualifier declared ... makes it possible to distinguish between a yielded result and a condition in a query. For example, query

{class <B>, <B> extends <A>, class <A>}

is invalid because it has two primitive conditions (class <B> and class <A>). However, query

{class <B>, <B> extends <A>, declared class <A>}

is valid and yields all subclasses of all classes.

Instances

Statement

{instance of Person}.age = 0;

assigns value 0 to field age of all instances of class Person. Depending on how semantics of queries is defined, instances may either refer to all declared instances of a class or to all instances existing during runtime.

Loops

Queries can be used in for-each loops:

for x in {var, of type int} {
  x = 0;
}

Scopes

Query {field, of type int, in declared class MyClass} yields all integer fields in class MyClass. The keyword declared can be omitted in “in” conditions.

Query {in function(int, int) returns <R>, var} yields all local variables in all functions (from the current scope) with two integer arguments.

Nested queries

Query

{in {function(int, int) returns <R>, in class MyClass}, var}

differs from the previous one in that it only considers methods of class MyClass.

Visibility modifiers

Queries can be used to define custom visibility modifiers.

class A {
 modifier children = {class <T>, <T> extends A};
 [children] int x; // only visible in subclasses of A
   ...
}


Queries as first-class citizens

A new primitive type, query, is introduced to represent reflection queries.

query a = {class, with constructor <X>()};
query b = {{a}, that extends MyClass};
<<b>> x = new <<b>>(); // parameterized statement;
                       // it creates instances of all subclasses of
                       // MyClass that have a parameterless constructor

In this example, query a yields all classes that have a constructor without parameters. Query b refines this query by additionally requiring that those classes extend MyClass. An instance is then created of each matching class.
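
In plain Java, a first-class query over classes can be approximated by a Predicate<Class<?>> value that is stored, refined and then applied. The following sketch is written under that assumption. The candidate classes are listed explicitly because Java cannot enumerate all classes, and the names FirstClassQueries, A, B and instantiate are hypothetical. Reflection also cannot tell an implicit default constructor from an explicitly written one, so both count here.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: queries as values that can be refined and applied.
final class FirstClassQueries {
    static class MyClass {}
    static class WithDefault extends MyClass { WithDefault() {} }
    static class NeedsArg extends MyClass { NeedsArg(int n) {} }
    static class Unrelated { Unrelated() {} }

    // query a = {class, with constructor <X>()};
    static final Predicate<Class<?>> A = c -> {
        try {
            c.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    };

    // query b = {{a}, that extends MyClass};
    static final Predicate<Class<?>> B =
            A.and(c -> MyClass.class.isAssignableFrom(c) && c != MyClass.class);

    // <<b>> x = new <<b>>();  creates one instance per matching class.
    static List<Object> instantiate(List<Class<?>> candidates, Predicate<Class<?>> query) {
        List<Object> instances = new ArrayList<>();
        for (Class<?> c : candidates) {
            if (!query.test(c)) {
                continue;
            }
            try {
                Constructor<?> ctor = c.getDeclaredConstructor();
                ctor.setAccessible(true);
                instances.add(ctor.newInstance());
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        return instances;
    }
}
```

Applied to the four sample classes, query B creates a single WithDefault instance: NeedsArg has no parameterless constructor, Unrelated does not extend MyClass, and MyClass is not a subclass of itself.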

To typecheck queries, the primitive type query should be annotated with the “type” of entities that a query yields. In the example above, the complete definitions of a and b would be:

query<class> a = ...
query<class> b = ...

Consider now another example:

query<type> t = {#that, var, @that.startsWith('a')};
<<t>> x;

Query t yields the types of all variables whose name starts with the character “a”. These types are then used in the parameterized declaration statement.

The following query increases all integer variables by 1.

query<var> q = {var, of type int};
<<q>> = <<q>> + 1;

Type annotations (<class>, <type>, <var>, etc.) of queries might not need to be specified explicitly, as they can be inferred in most cases from the queries themselves.


Kinds of queries

How can the “machinery” of a query be represented so that queries can be “compared” with each other? A possible answer might be to define kinds of queries, in a way somewhat similar to kinds of types.

For example, the kind of query {var, of type int} is VAR*, denoting that it yields some variable subject to some conditions, while the kind of query {var age} is VAR (without the star), because it yields a specific variable defined in the query itself.

For query

{instance of <T>, declared class <T>}

the kind is CLASS* -> INST*, whereas the kind of query

{instance of MyClass}

is CLASS -> INST*.

Finally, for query

{
 declared class <A>, declared class <B>,
 <A> extends <B>,
 field, <A> has that
}

the kind is CLASS* -> CLASS* -> FIELD*, denoting that this query has conditions on two classes and yields a field.


Semantics and implementation

Query {var, of type int} yields all integer variables from the current scope. This query can be used in an assignment statement:

int x, y, z;
{var, of type int} = 0;   // x = 0; y = 0; z = 0;

In compile-time semantics, queries are essentially treated as macros, and the assignment above is transformed into a sequence of assignments

x = 0;
y = 0;
z = 0;

In run-time semantics, the assignment is transformed to (Java/C#/…) code that emulates the query using corresponding reflection API.
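
As a minimal sketch of what such generated code might look like in Java, the example below emulates the assignment with java.lang.reflect. It works on the int fields of an object rather than on local variables, because Java reflection has no access to locals. The optional constraint (for example that > 0) is modelled as an IntPredicate evaluated at run time. The names AssignAllInts, Counters and assign are hypothetical, and this is not the code produced by the prototype.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.function.IntPredicate;

// Hypothetical sketch: run-time emulation of "{field, of type int, <constraint>} = value;".
final class AssignAllInts {
    static class Counters {
        int x = -1, y = 2, z = 3;
        String label = "keep";
    }

    // Assigns value to every non-static, non-final int field of target whose
    // current value satisfies the constraint; returns the number of fields updated.
    static int assign(Object target, IntPredicate constraint, int value) {
        int updated = 0;
        for (Field f : target.getClass().getDeclaredFields()) {
            int mods = f.getModifiers();
            if (f.getType() != int.class || Modifier.isStatic(mods) || Modifier.isFinal(mods)) {
                continue;
            }
            try {
                f.setAccessible(true);
                if (constraint.test(f.getInt(target))) {
                    f.setInt(target, value);
                    updated++;
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return updated;
    }
}
```

Calling assign(c, v -> v > 0, 0) on a fresh Counters object resets y and z and leaves x and label untouched; passing v -> true as the constraint corresponds to the unconstrained query.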

For queries with constraints, such as {var, of type int, that > 0}, only run-time semantics can be defined, because the values being tested are not known at compile time.

The design of LIRQ is still very experimental. I am implementing an early prototype using the JetBrains MPS language workbench, which allows extending Java with new constructs.

Since MPS uses projectional editing, there are no issues with possible ambiguities in concrete syntax of reflection queries (for example, no problem at all with curly braces in queries vs. curly braces in compound statements in Java).

Semantics of reflection queries is enabled by model transformation and code generation mechanisms of MPS. In compile-time semantics, generated code can also be statically typechecked.

Some related work

  • Object Constraint Language allows specifying conditions of the form context Person inv: self.age >= 0 in a manner similar to reflective queries.

  • Constraints on type parameters in C# generics and wildcards in Java are similar to qualifiers in the terminology of this post.

  • ECMAScript has support for computed names of object members, which resembles name queries.

  • C# has the nameof expression, which returns the name of an identifier; this is exactly what “@” queries do.


Further ideas

  • Queries like {int, that > 10} can be seen as dependent types.

  • Similarly, declarations such as type number = {int | float}; resemble type classes (for example, in Haskell) and “multitypes”.

  • Reflection queries may be extended to access the AST, for example, query {declared [for loop] <S>, [counter] of <S>, @that} would yield names of all counters in all for loops.



Source: Medium

