Tario's Project: March 2011

Saturday, March 19, 2011

How the AST hook works and what the current implementations are

The AST (Abstract Syntax Tree) hook is a technique for controlling the behavior of certain Ruby node elements such as method calls, global variables, etc.

First of all, certain Ruby interpreters (including MRI) have an internal representation of the AST called the "node tree". Each piece of code that the interpreter reads for execution is parsed and represented as a node tree structure. This structure is then read at runtime and "executed" by the interpreter core/VM.

So, the tree hook technique consists of modifying that tree before it is read and executed by the interpreter, in order to install certain "hooks". This is done by patching the tree and inserting new node elements.

Roughly, in the normal execution flow of the interpreter, the code is parsed and translated into the AST, which is finally executed by the VM that is part of the interpreter core.

When the AST hook operates, it changes the node tree after it is built and before it is executed.

For example, a call node can be intercepted by replacing it with a chain of call nodes that routes the original call through a handler.

The resulting node layout is a valid node tree structure too. It emulates the original call after notifying the event to a handler, which decides what to do.
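
As a rough illustration of that rewrite, here is a minimal Ruby sketch that models nodes as nested arrays. The node shapes ([:call, receiver, name, args] and [:lit, value]) and the function name hook_calls are assumptions made for this example, not the interpreter's real structures. The method names hooked_method, set_hook_handler and call mirror the ones used by the MRI hack described later in this post.

```ruby
# Nodes are modeled as nested arrays (an assumption for illustration):
#   [:call, receiver_node, method_name, [:arglist, *arg_nodes]]
#   [:lit, value]
def hook_calls(node, handler)
  return node unless node.is_a?(Array)

  # Patch children first, so nested calls are hooked as well.
  patched = node.map { |child| hook_calls(child, handler) }
  return patched unless patched.first == :call

  _, receiver, method_name, args = patched
  # recv.m(args) becomes
  # recv.hooked_method(:m).set_hook_handler(handler).call(args)
  hooked = [:call, receiver, :hooked_method, [:arglist, [:lit, method_name]]]
  with_handler = [:call, hooked, :set_hook_handler, [:arglist, [:lit, handler]]]
  [:call, with_handler, :call, args]
end

# The tree for "2 + 2", patched with a placeholder handler.
tree = [:call, [:lit, 2], :+, [:arglist, [:lit, 2]]]
patched_tree = hook_calls(tree, :my_handler)
```

The patched tree is still made only of ordinary call and literal nodes, which is why the interpreter can execute it without knowing that a hook was installed.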

Current Implementations

MRI (Matz Ruby Interpreter) Hack

The first implementation of tree patching was a hack of MRI that applies the patch directly to the node tree structure located in the memory of the Ruby interpreter process. Pointers to certain nodes can be obtained in a C extension using a trick like this:



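VALUE hook_block(VALUE self, VALUE handler) {
    /* ruby_frame is an MRI 1.8 internal: the frame of the current call */
    process_node(ruby_frame->node->nd_recv, handler);
    return Qnil; /* a VALUE function must return a value */
}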

And then, the node tree can be walked to make the patching. For example, patching the call node:


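/* Rewrites recv.mid(args) into
   recv.hooked_method(:mid).set_hook_handler(handler).call(args) */
void patch_call_node(NODE* node, VALUE handler) {
    NODE* args1 = NEW_LIST(NEW_LIT(ID2SYM(node->nd_mid)));
    NODE* args2 = NEW_LIST(NEW_LIT(handler));

    node->nd_recv = NEW_CALL(node->nd_recv, method_hooked_method, args1);
    node->nd_recv = NEW_CALL(node->nd_recv, method_set_hook_handler, args2);

    /* the original arguments are kept and are now passed to "call" */
    node->nd_mid = method_call;
}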
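
What the patched expression does at runtime can be sketched in plain Ruby. This is only an illustration under assumptions: the HookedCall class, the Hookable module and the handle_method interface of the handler are hypothetical names chosen for this example, and only the method names hooked_method, set_hook_handler and call come from the C code above.

```ruby
# Hypothetical runtime support for the patched call chain.
class HookedCall
  def initialize(receiver, method_name)
    @receiver = receiver
    @method_name = method_name
  end

  def set_hook_handler(handler)
    @handler = handler
    self
  end

  # Notify the handler first, then emulate the original call.
  def call(*args, &block)
    @handler.handle_method(@receiver, @method_name)
    @receiver.send(@method_name, *args, &block)
  end
end

module Hookable
  def hooked_method(method_name)
    HookedCall.new(self, method_name)
  end
end

# Example handler (hypothetical interface): records calls and denies :upcase.
class LoggingHandler
  attr_reader :seen

  def initialize
    @seen = []
  end

  def handle_method(_receiver, method_name)
    @seen << method_name
    raise SecurityError, "#{method_name} is not allowed" if method_name == :upcase
  end
end

handler = LoggingHandler.new
receiver = String.new("abc").extend(Hookable)

# Equivalent of the patched form of: receiver + "def"
result = receiver.hooked_method(:+).set_hook_handler(handler).call("def")
```

Because the handler runs before the real method is sent, it can log the call, or raise to deny it, without the calling code being aware of the hook.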

Advantages

Easy to implement in C using interpreter code

Disadvantages

MUST be implemented in C to access the interpreter's internal structures
Poor compatibility: the implementation relies on internal structures of a particular Ruby interpreter version (e.g. it won't work in Ruby 1.9)

https://github.com/tario/evalhook/blob/v0.2.0/ext/evalhook_base/evalhook_base.c

Partial Ruby

In the implementation described above, the main problem of the hack was compatibility, because the tree patching is performed by changing internal structures of the interpreter that may or may not exist in a given interpreter. It was done that way because the interpreter does not expose any service in its API for modifying the tree (in fact, many interpreters may not have a node tree at all).

The solution is to create another interpreter that exposes the needed services, using resources that already exist in the environment: the parser, the API and the VM.
Basically, PartialRuby parses the input Ruby source into an AST represented with Ruby structures, then translates that AST into Ruby code that emulates it, and finally passes the emulation code to the real interpreter.

In this scenario, PartialRuby exposes in its API the services needed to perform the node tree patching.
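
The idea can be sketched with a toy emitter. This is a minimal illustration, not PartialRuby's actual API: the HookingEmitter class, the array-based node shapes and the local variable named handler are all assumptions made for the example, and the tree is written by hand instead of coming from a parser.

```ruby
# Hypothetical mini "emulator": turns a tiny AST into Ruby source in which
# every call is routed through a local variable named handler.
class HookingEmitter
  def emit(node)
    type, *children = node
    send("emit_#{type}", *children)
  end

  private

  def emit_lit(value)
    value.inspect
  end

  def emit_call(receiver, method_name, args)
    arg_nodes = args.drop(1) # skip the :arglist tag
    parts = [emit(receiver), method_name.inspect, *arg_nodes.map { |a| emit(a) }]
    "handler.call(#{parts.join(', ')})"
  end
end

# The handler sees every call before it is performed.
calls = []
handler = lambda do |receiver, method_name, *args|
  calls << method_name
  receiver.send(method_name, *args)
end

# Hand-written AST for "2 + 2".
tree = [:call, [:lit, 2], :+, [:arglist, [:lit, 2]]]
source = HookingEmitter.new.emit(tree)

# The emitted source is handed to the real interpreter.
result = eval(source, binding)
```

Since the hook is inserted while generating ordinary Ruby source, nothing here depends on the internals of a particular interpreter, which is the compatibility argument made above.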

Advantages

Can be implemented in pure ruby
No access to Ruby interpreter internals needed: compatibility guaranteed

Disadvantages

Must re-implement a part of the Ruby VM

https://github.com/tario/partialruby

Thursday, March 17, 2011

Shikashi - A flexible sandbox for ruby

Shikashi is a sandbox for Ruby that intercepts all Ruby method calls executed in the interpreter and allows or denies them depending on the receiver object, the method name and the source file where the call originated.

Goals of the project

Provide a sandbox API to build up scripting services with granular control of privileges even at method and global variable levels

The API

The Shikashi API exposes two main services: an "eval" method and a representation of privileges. The "eval" simply runs code in the sandbox and applies the restrictions specified in the privileges passed as a parameter. Example:


require "shikashi"

sandbox = Shikashi::Sandbox.new
privileges = Shikashi::Privileges.new

privileges.allow_method :"+" # mandatory to prevent SecurityError exception

result = sandbox.run "2+2", privileges
print result,"\n" # 4

Also, the API allows finer control of privileges by object and method name, and it is not limited to methods: it also controls access to global variables and constants.


# ...
privileges.allow_global :$a # mandatory to prevent SecurityError exception

sandbox.run("$a = 4", privileges)

print $a, "\n"
# ...
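
To make the allow/deny idea concrete without depending on the gem, here is a minimal, self-contained sketch of a privilege table. It is not Shikashi's implementation: the MiniPrivileges class and its check_method! and check_global! methods are hypothetical, and unlike Shikashi it only looks at names, not at the receiver object or the source file.

```ruby
# Hypothetical privilege table, NOT Shikashi's implementation.
class MiniPrivileges
  def initialize
    @methods = []
    @globals = []
  end

  def allow_method(name)
    @methods << name.to_sym
    self
  end

  def allow_global(name)
    @globals << name.to_sym
    self
  end

  # Everything that was not explicitly allowed is denied.
  def check_method!(name)
    return if @methods.include?(name.to_sym)

    raise SecurityError, "method #{name} is not allowed"
  end

  def check_global!(name)
    return if @globals.include?(name.to_sym)

    raise SecurityError, "global #{name} is not allowed"
  end
end

privileges = MiniPrivileges.new
privileges.allow_method(:+).allow_global(:$a)

privileges.check_method!(:+)  # allowed, returns quietly
privileges.check_global!(:$a) # allowed, returns quietly

denied = begin
  privileges.check_method!(:system)
  false
rescue SecurityError
  true
end
```

A hook handler like the ones discussed in the AST hook post would consult such a table before letting each intercepted call or global access proceed.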

Current Status

Currently there is a stable version of the gem (0.3.1), but compatibility is only guaranteed for very few environments, excluding many of the interesting ones such as Heroku, where I have verified that the gem does not work.

Future

The next improvement for Shikashi (version 0.4.0) will of course be compatibility, with two specific objectives:

Making it work in Ruby 1.9
Making it work in Heroku (without UI or any web programming)

In addition, secondary objectives concerning API usability were defined for the next release. These include syntactic sugar to make it easier to use, like so:


Sandbox.run "2+2", Privileges.allow_method("+")

Links:
https://github.com/tario/shikashi
http://tario.github.com/shikashi/doc/

Sunday, March 13, 2011

Hello, and Welcome to the blog

Hello, and welcome to the Tario's Project blog, a site dedicated to announcements and information about projects, mainly Ruby gems. Here you will find the latest information and all the future plans that are coming.

Blog Content

This blog will not follow a specific pattern, but most posts will be of one (or a combination) of the following types:

Releases of the projects detailing new features, bugs fixed and other improvements
Announcements about future release dates with their respective details (features, bug fixes, etc...)
Creation of new projects with their details including: goals, dependencies, connection with other projects, repository, ideas, etc...
Refactoring or re-engineering, with an explanation of how these modifications will lead to improvements in the near future (e.g. compatibility)
Technical information that affects everyone in many or all the projects (dependency maps, tooling, technical knowledge, development methodologies, etc...)
Announcements about project management: registration of new committers, merging of external contributions, merging of forks of the projects, contributions to other open source projects
Retrospectives about the development practice and the blog itself as a communication tool part of the development process

Current Projects

The current projects are mainly Ruby gems hosted on GitHub (http://github.com/tario). Most of the projects use Ruby as the principal language, but include C when applicable and necessary (e.g. evalhook). There are at least twenty projects: some with more activity than others, some dead and waiting to be resurrected or passing on to a better life in project heaven, and some very much alive. The two main live projects are:

Shikashi. A Ruby sandbox designed for "unprivileged" code execution by restricting the allowed Ruby operations (method calls, global variable access, constant access, etc.)
ImageRuby. A flexible pure ruby library for image handling that potentially accepts plugins to add new image processing operations, new image formats and even C implementations of the processing operations to improve performance

The rest of the live projects are dependencies of these main projects, PoC projects that depend on them, or PoCs of ideas for future projects (mainly about language engineering).

For example, rallhook (and its dependencies), evalhook, evalmimic, getsource and partialruby are related to the sandbox, and nogaku is a PoC of a Ruby VM in pure Ruby.

Next, there will be a detailed post for each project, starting of course with the main projects and then continuing with the others.

Links:
https://github.com/tario