sábado, 19 de noviembre de 2011

Fastruby 0.0.16 released with ruby 1.9 support

Fastruby is a gem that executes ruby code much faster than normal. It is currently in a state of transition between a spike and a usable gem, and it is released whenever possible with incremental improvements.
The v0.0.16 release simply adds support for ruby 1.9; fastruby was also fixed for ruby 1.8 running under rvm (ruby 1.8 patch 334). The big disclaimer here is that ruby 1.9-specific code, such as fibers and the new hash syntax, won't work yet, and fastruby is still slower than ruby 1.9 (both issues will be addressed in the next releases).
All code that works with fastruby on ruby 1.8 will now also work with fastruby on ruby 1.9.

Install

You can clone the repository at github:
git clone git://github.com/tario/fastruby.git
git checkout v0.0.16
Or install it using gem install:
gem install fastruby 
Platforms Supported
  • Ruby 1.8 patch 352
  • Ruby 1.8 patch 334 (tested using rvm)
  • Ruby 1.9 patch 180 (tested using rvm)


jueves, 10 de noviembre de 2011

Migrating everything to ruby1.9

Since the warning I got from an excellent talk by @yhara at @RubyConfAr, ruby 1.9 is the present while ruby 1.8 is the past. The next step for my projects is to ensure that everything works on ruby 1.9 (while keeping compatibility with ruby 1.8 at the same time).
This is the status of the most important projects.

ImageRuby

All tests in the ImageRuby spec pass on 1.9.2-p180 thanks to the contributed fix by amadanmath. Of course, I will be listening for bug reports regarding ruby 1.9 compatibility.

Shikashi (and dependencies)

All tests in the Shikashi spec pass on 1.9.2-p180. Ruby 1.9-specific syntax is not covered by the spec; I have created a new issue for this (the issue belongs to the partialruby gem).

To make shikashi fully compatible with ruby 1.9, I must find a replacement for RubyParser, which does not handle some of the ruby 1.9-specific syntax; for example, RubyParser is unable to recognize the new ruby 1.9 hash syntax.
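
As a quick illustration, here is the same hash literal in both syntaxes; a 1.8-era grammar like the one RubyParser implements only accepts the first form:

old_style = { :key => "value" }   # 1.8-style literal, parsed fine by RubyParser
new_style = { key: "value" }      # ruby 1.9 only; a 1.8 grammar raises a parse error here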

Fastruby

This is the most complex case, these are the remaining issues:
  • Main headers of fastruby won't compile since these sources use syntax not supported by ruby1.9
  • Use of C-Api specific to ruby1.8
  • Remove support for syntax specific to ruby1.8 (only when run under ruby1.8)
  • Add support for syntax specific to ruby1.9 (only when run under ruby1.9)
  • RubyInline does not work with ruby1.9
  • RubyParser does not work with ruby1.9

I will fork, or find a fork of, RubyInline to fix the issues related to ruby 1.9 compatibility. For the case of RubyParser (which affects shikashi too), I will replace the parsing with Ripper, the out-of-the-box parser implementation that ships with ruby 1.9.
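
A minimal sketch of that direction: Ripper ships with ruby 1.9 and already understands the new syntax (output abbreviated):

require "ripper"
require "pp"

pp Ripper.sexp("x = {a: 1}")   # Ripper parses the 1.9 hash syntax without problems
# [:program, [[:assign, [:var_field, [:@ident, "x", ...]], [:hash, ...]]]]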

martes, 1 de noviembre de 2011

Fastruby 0.0.15 released. Callcc and continuation objects!

Fastruby is a gem that executes ruby code much faster than normal. It is currently in a state of transition between a spike and a usable gem, and it is released whenever possible with incremental improvements.
The v0.0.15 release adds support for continuation objects created with the callcc function. To achieve this, the implementation uses the non-linear stack introduced in previous releases.
This release also adds a few extra language-support improvements, such as method default arguments.

Install

You can clone the repository at github:
git clone git://github.com/tario/fastruby.git
git checkout v0.0.15
Or install it using gem install:
gem install fastruby

New Features

Fixes

Examples of code being supported

Example of continuation calls, tail recursion


require "fastruby"

fastruby '
class X
  def fact( n )
    a = callcc { |k| [ n, 1, k ] }
    n = a[0]
    f = a[1]
    k = a[2]
    if ( n == 0 ) then return f
    else
      k.call n-1, n*f, k
    end
  end
end
'

p X.new.fact(6) # 720

Default Arguments
require "fastruby"

fastruby '
class X
  def foo(a,b = a.reverse)
    a + b
  end
end
'

p X.new.foo("13") # 1331
p X.new.foo("xx", "yy") # xxyy

Passing proc objects as blocks
require "fastruby"

fastruby '
class X
  def bar
    yield(1)
    yield(2)
    yield(3)
  end

  def foo
    pr = proc do |x|
      p x
    end

    bar(&pr) # passing proc as a block
  end
end
'

X.new.foo

Receiving blocks as proc objects
require "fastruby"

fastruby '
class X
  def bar(&pr)
    pr.call(1)
    pr.call(2)
    pr.call(3)
  end

  def foo
    bar do |x|
      p x
    end
  end
end
'

X.new.foo


domingo, 23 de octubre de 2011

Fastruby 0.0.14 released with support for method replacement

Fastruby is a gem that executes ruby code much faster than normal. It is currently in a state of transition between a spike and a usable gem, and it is released whenever possible with incremental improvements.
The v0.0.14 release redesigns the internal method hash structure to store pointers to function pointers. This makes it possible to reference a fixed function-pointer slot whose target may change, enabling method replacement while retaining most of the performance (fastruby calls now pay an extra indirection and a few extra checks).

Install

You can clone the repository at github:
git clone git://github.com/tario/fastruby.git
git checkout v0.0.14
Or install it using gem install:
gem install fastruby

New Features


Examples of code being supported

Replacement of methods

require "fastruby"

fastruby '
class X
  def foo(a)
    a+1
  end
end
'

x = X.new
p x.foo(4) # 5

fastruby '
class X
  def foo(a)
    a*2
  end
end
'

p x.foo(4) # 8


Remaining uncovered items in the v0.0.14 release
  • Memory handling of function pointers in the method hash (will be fixed for v0.0.15)
  • Fix the limitation of 15 arguments when calling a method defined in fastruby from normal ruby
  • Fix duplication of objects in the cache when methods are replaced
  • Optimization of splat calls; currently these calls are executed using a normal rb_funcall2 (will be improved for v0.2.0)
  • Optimization of calls to methods with a variable number of arguments; currently they are wrapped using rb_funcall (will be improved for v0.2.0)

domingo, 9 de octubre de 2011

Fastruby 0.0.13 released with support for splat arguments and method with array arguments

Fastruby is a gem that executes ruby code much faster than normal. It is currently in a state of transition between a spike and a usable gem, and it is released whenever possible with incremental improvements.
The v0.0.13 release of fastruby makes a small internal design improvement to the translator class and implements splat arguments and methods with optional arguments (see examples below).

Install

You can clone the repository at github:
git clone git://github.com/tario/fastruby.git
git checkout v0.0.13
Or install it using gem install:
gem install fastruby

New Features



Examples of code being supported

Multiple arguments and call with splat

require "fastruby"

fastruby '
class X
  def foo(*array)
    array.each do |x|
      p x
    end
  end

  def bar(*args)
    foo(*args)
  end
end
'

x = X.new
x.foo(1,2,3) # prints 1,2 and 3
x.bar(4,5,6) # prints 4,5 and 6

Remaining uncovered items on v0.0.13 release

domingo, 2 de octubre de 2011

Fastruby 0.0.11 released with support for retry, redo and foreach

Fastruby is a gem that executes ruby code much faster than normal. It is currently in a state of transition between a spike and a usable gem, and it is released whenever possible with incremental improvements.
The v0.0.11 release of fastruby makes internal improvements to non-local jumps and implements for-each support, including the retry and redo statements (both applicable to any kind of block, and retry also applicable in a rescue clause).

Install

You can clone the repository at github:
git clone git://github.com/tario/fastruby.git
git checkout v0.0.11
Or install it using gem install:
gem install fastruby

New Features


Examples of code being supported

For each with redo and retry

require "fastruby"

fastruby '
class X
  def foo
    sum = 0

    for i in (4..8)
      p i # 4 4 4 4 4 5 6 7 4 5 6 7 4 5 6 7 4 5 6 7 8
      sum = sum + i
      redo if sum < 20
      retry if sum < 100 and i == 7
    end

    sum
  end
end
'

x = X.new
x.foo # 46


miércoles, 28 de septiembre de 2011

Fastruby 0.0.9 released with support for proc

Fastruby is a gem that executes ruby code much faster than normal. It is currently in a state of transition between a spike and a usable gem, and it is released whenever possible with incremental improvements.
Following version v0.0.8, which implemented several internal improvements including a new non-linear stack structure, the new v0.0.9 release implements support for procedure objects on top of that implementation.

Install

You can clone the repository at github:
git clone git://github.com/tario/fastruby.git
git checkout v0.0.9
Or install it using gem install:
gem install fastruby

New Features


Examples of code being supported

return method from proc

require "fastruby"

fastruby '
class X
  def foo
    pr = Proc.new do
      return "return foo from inside proc"
    end

    pr.call

    return "return foo"
  end
end
'

x = X.new
x.foo # this returns "return foo from inside proc"

The behavior of Proc objects (those created using Proc.new) is very similar to the behavior of lambdas, but with a few small differences (see the plain-ruby sketch after this list):
  • It is possible to do a non-local return from a Proc.new block out of the method where the block was defined; this non-local return fails with LocalJumpError when the method scope is already closed at the moment the return is invoked (in lambdas, return only acts as next, i.e. it only ends the lambda)
  • Break jumps are illegal from Proc.new blocks and always fail with LocalJumpError (in lambdas, break acts as next)
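
A minimal plain-ruby sketch of those differences (ordinary ruby, nothing fastruby-specific):

def with_proc
  pr = Proc.new { return "from inside proc" } # non-local return: exits with_proc
  pr.call
  "after call" # never reached
end

def with_lambda
  lm = lambda { return "from inside lambda" } # return only ends the lambda
  lm.call
  "after call" # reached normally
end

def make_proc
  Proc.new { return "too late" }
end

p with_proc   # "from inside proc"
p with_lambda # "after call"

pr = make_proc
begin
  pr.call # the method that created the proc already returned
rescue LocalJumpError => e
  p e.class # LocalJumpError
end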

As an additional note, block objects created using the proc method are exactly the same as block objects created using the lambda method; just check the ruby 1.8 source code to see something like this:

   ...
rb_define_global_function("proc", proc_lambda, 0);
rb_define_global_function("lambda", proc_lambda, 0);
...





lunes, 19 de septiembre de 2011

Fastruby roadmap up to 0.5.0

Fastruby development (up to the 1.0.0 release) will be divided into two main phases:
  • Incremental Proof of Concept. Transition from a base spike/PoC (version 0.0.1) to a usable gem (version 0.1.0).
  • Usable versions. Transition from a minimally usable gem (version 0.1.0) to a stable API release (version 1.0.0)
This is a rough roadmap and may change in the future if new ideas or optimizations are discovered.

PoC Incremental Versions


Version 0.0.8: lambdas (already released)
Version 0.0.9: proc
Version 0.0.10: Gem packaging fixes

Version 0.0.11: for, retry, redo
Version 0.0.12: Full exception support

Version 0.0.13: Methods with variable number of arguments

Version 0.0.14: Support for method replacement

Version 0.0.15: continuation objects


Etc...

Version 0.1.0: First usable release
  • Any ruby code executable by MRI can be executed by fastruby

Usable Versions (Following semantic versioning)

Zero as the major version number is used while the API changes every day without backward compatibility (despite the fact that there will be backward compatibility when possible). Also, in 0.X.X versions the API is completely undefined.
Bugs and spec non-compliance of the 0.1.0 release will be fixed in subsequent stabilization releases (0.1.1, 0.1.2, etc...)

Version 0.1.0: Execution of Ruby

Features:
  • Any code executable in ruby is executable via fastruby and gives the same results
  • Interoperation between ruby and fastruby is guaranteed

Version 0.2.0: Fast execution of ruby, internal optimizations

Features
Version 0.3.0: Fast execution of ruby, C extension API

  • C extension points
    • API to define methods in C, by class, method name and signature allowing replacement of ruby standard lib implementation in the future.

Version 0.4.0: Fast execution of ruby, C extension API 2

  • C extension points
    • Macro methods (keeping dynamism). Examples:
      1. Fixnum#+(x:Fixnum) => INT2FIX(FIX2INT(self)+FIX2INT( x) )
      2. String#[](x:Fixnum) => rb_str_new2(RSTRING(self)->ptr[FIX2INT(x)],1)
      3. String#[]=(x:Fixnum,y:Fixnum) =>
        RSTRING(self)->ptr[FIX2INT(x)]=FIX2INT(y)

Version 0.5.0: Faster standard lib

  • Ruby standard lib re-implemented in a separated gem optimized for fastruby
  • Fastruby works with and without optimized stdlib

Fastruby 0.0.8 released with support for lambdas

Fastruby is a gem that executes ruby code much faster than normal. It is currently in a state of transition between a spike and a usable gem, and it is released whenever possible with incremental improvements.
Version v0.0.8 was released with the new feature of supporting lambdas. The functional improvement may not appear important, but the underlying internal implementation will allow new features in the next releases, such as support for proc and continuation objects.

Install

You can clone the repository at github:
git clone git://github.com/tario/fastruby.git
Or install it using gem install:
gem install fastruby
New Features
Examples of code being supported

Scope retention (lambda has a reference to the scope where it was created)

require "fastruby"

fastruby '
class X
  attr_accessor :lambda_set, :lambda_get
  def initialize
    a = nil

    @lambda_set = lambda{|x| a = x}
    @lambda_get = lambda{a}
  end
end
'

x = X.new

x.lambda_set.call(88)
p x.lambda_get.call

Yield block of the scope

require "fastruby"

fastruby '
class X
  def bar
    lambda_obj = lambda{
      yield
    }
    lambda_obj
  end

  def foo
    a = 77
    bar do
      a
    end
  end
end
'

x = X.new

lambda_obj = x.foo
p lambda_obj.call # 77


viernes, 16 de septiembre de 2011

Callcc puzzle on fastruby IV: Execution stack as a DAG, lambda cases

In order to implement lambdas and continuations (in the near future), a new structure to represent the stack was implemented; it will be available in the 0.0.8 release of fastruby as the base of lambda support. Call with current continuation will be implemented in 0.0.9.

The implementation covers two main cases:
  • Normal linear stack (keeping performance where possible)
  • Non-linear stack (DAG), used when lambdas are created
The continuation case remains uncovered (for now); the storage of the native stack is probably the most important outstanding issue to solve in order to implement continuations.

New Stack Structure


To represent a ruby stack, a new structure was created: StackChunk. A stack chunk is a representation of a linear stack; allocating and de-allocating scopes from it is far simpler than generic dynamic memory allocation (e.g. malloc).

Since the ruby stack is non-linear, StackChunks are used for the longest linear sections and are chained following a DAG pattern. It is similar to the internal representation of frames used by MRI, with the difference that a stack chunk may contain many scopes, even all of them when the stack has no divergences; in those cases a single dynamic memory allocation may be enough for all the scopes needed by the execution.
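
A rough conceptual sketch of the idea in plain ruby (the real structure lives in C inside fastruby; class and method names here are illustrative only):

class StackChunk
  def initialize(parent = nil)
    @parent = parent # chained chunks form the DAG
    @scopes = []
    @frozen = false
  end

  attr_reader :parent

  def freeze_chunk! # called when a lambda/continuation captures this chunk
    @frozen = true
  end

  def alloc_scope(locals)
    raise "frozen chunk: cannot allocate new scopes" if @frozen
    scope = locals.dup
    @scopes << scope
    scope
  end
end

chunk = StackChunk.new
chunk.alloc_scope(:a => nil)  # a method call allocates its locals here
chunk.freeze_chunk!           # a lambda was created: freeze the chunk
child = StackChunk.new(chunk) # further calls allocate scopes in a new chunk
child.alloc_scope(:b => 1)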

Three main cases

There are three main cases of interest where the new structure has relevance

Case 1: normal linear stack, single stack chunk

The StackChunk (like the native stack) is an entity that grows dynamically and linearly. So, when representing a linear stack, only one stack chunk is needed and it grows dynamically as new scopes are allocated.

The stack chunk is created the first time a fastruby method creates a scope. At an arbitrary point, the stack looks like this:



When a ruby method (built with fastruby) is called, a scope to hold local variables (and other information, such as the passed block) is created; the memory where this scope lives is provided by the stack chunk. There is no need to allocate ruby objects, or even dynamic memory in most cases. So the linear stack chunk grows as the native stack grows:



While the empty circles represent ruby objects referenced by local variables, the native stack holds native C pointers to the scopes allocated in the stack chunk.

Since local variables live on these stack chunks instead of the native stack, they can survive beyond the C frame where they were created.

Case 2: Lambdas, frozen and multiple stack chunks to represent DAG stack

Each time a lambda object is created, fastruby attaches a ruby reference to the StackChunk in use at the moment the lambda is created; also, the state of that StackChunk and all its parents is changed to frozen. A frozen stack chunk cannot create new scopes, so that all the scopes that existed at the moment of freezing are kept alive, but the variables living on that stack chunk may still change their values.



For example, if a lambda is created at a given stack point, then the stack rolls back a few frames and new ones are created (e.g. by calling other methods), the state will look something like this:


Red arrows represent ruby references between ruby objects; these references are created only to keep the related stack chunks alive during garbage collection: ruby lambda objects store a reference to the stack chunk in use at the moment the lambda was created. To handle the dynamic memory used by the stack, it is wrapped (gray arrows) by a ruby object of class StackChunk, so the memory is freed when the garbage collector decides the object must be disposed of because it has no references.

Black arrows indicate native C pointers: the native stack holds native local variables pointing to scopes (representations of ruby scopes, including ruby local variables) living on the native heap allocated by the stack chunk.

Green arrows indicate native references (VALUE) to ruby objects (represented by empty circles); to avoid these objects being disposed of by the GC, the stack chunk must implement the mark method and mark each object referenced this way. These values can change both on frozen and on active stack chunks.

Case 3: Continuation objects (NOT IMPLEMENTED YET)

The handling of dynamic and DAG stack chunks is solved in the same way as in the lambda case: each time a continuation object is created (using the callcc ruby method), a reference to the stack chunk in use at that moment is attached to it to prevent that stack chunk from being disposed of by the GC.
The new problem to overcome is the recording of what would be the representation of ruby frames in fastruby; these representations are native local variables living in the native stack (including the most important one, the pointer to the scope), and since the native stack is linear, it loses that data as it shrinks and overwrites the old data when it grows back (e.g. when another method is called).
Maybe the solution in this case is to save the native stack together with the stack chunk and restore it each time a continuation object is called, but that is a topic for another article.



miércoles, 7 de septiembre de 2011

Callcc puzzle on fastruby III: Almost solved?

First of all, the plan to migrate to C++ was canceled; the ruby Garbage Collector will be used instead to manage the resources used by the fastruby stack entities.

The cost of instantiating objects on ruby

Since the ruby garbage collector works only with ruby objects, I need to encapsulate as ruby objects whatever I want to have managed by the GC. The first option that comes to mind is to encapsulate every local variable scope into a ruby object: bad idea. This would instantiate a ruby object for each scope, and the cost of instantiating objects in ruby is of a higher order than a single memory allocation in C. The problem with this approach can be appreciated in a spike which adds an instantiation of a ruby object (using rb_objnew) for each scope, to see the impact on general performance measured with the benchmark scripts; the result was a 3X increase in the execution time of the main benchmark, enough to reject the approach of one ruby object per locals scope.

Stack Chunk

To solve the issue of multiple ruby object allocations, multiple locals scopes will be grouped and encapsulated in a single ruby object, specifically the scopes belonging to the same stack. Stack chunks will grow dynamically (using C malloc) as locals scopes are allocated, and all the dynamic memory used by a stack chunk will be de-allocated when it is collected by the ruby GC.
For the cases of lambdas, procs and continuations, StackChunk objects will be referenced from a ruby local variable in the ruby scope, so lambdas duplicating ruby stacks will create new references to stack chunks. Stack chunks will become immutable/frozen when lambdas or continuation objects are created, to prevent new scopes from overwriting local variables; local variables living on immutable/frozen stack chunks may still be changed, but frozen chunks reject the allocation of new scopes.

Stack Chunk properties
  • All local variables will be placed/allocated on the current stack chunk at the moment the function frame is initialized
  • Each stack chunk is wrapped as a ruby object of type StackChunk to provide the ruby garbage collector with the interface to release stack chunks from memory (Data_Wrap_Struct) when they have no references.
  • The StackChunk will be referenced from the local variable scope of MRI (Matz Ruby Interpreter) to prevent the object from being collected until it stops being used (e.g. when leaving the scope where the stack chunk was created); in the internal implementation of MRI, lambdas, procs and continuation objects retain references to MRI stack frames, so the lambda and continuation issue will be solved (in part) with this.
  • Callcc, lambdas and procs will turn the current stack chunk into the frozen state; this implies that new scopes can't be created in that stack chunk. When new scopes are required, a new stack chunk will be created
Implementation

See https://github.com/tario/fastruby/commits/dynamic_locals

martes, 6 de septiembre de 2011

Callcc puzzle on fastruby II: The return of C++

In a previous post I explained the problem of implementing callcc while keeping fastruby's performance; then the focus was moved to a simpler issue: lambdas.

Approaching the lambda issue allows us to recognize the non-linear nature of the ruby stack: while the native stack grows by allocating space for local variables and shrinks by deallocating that space, the non-linear stack of ruby allows keeping isolated local variable scopes associated to lambda blocks alive (while previous scopes in the stack are deallocated), and even allows retaining all the scopes of a stack for continuation objects created with callcc.

The solution for the lambda case is as simple as changing the place where the local variable scope lives: heap instead of stack.

Making use of dynamic memory is not a game; misused, it may hurt or even kill (the ruby process, after a relatively slow death by cancer).


Release of allocated memory and C++

Each allocation must have its corresponding de-allocation. It's that simple. But not so simple when programming in C, because [...]
RAII, as provided by C++, makes it easier to code scopes of resource initialization and finalization (dynamic memory, in this case) by using class destructors to release the resources; this covers function returns and C++ exceptions. It does not cover longjmp (this is analyzed later in this post). See the wikipedia article about RAII for more info and code examples.

Reference Counting

Immediately after creating a lambda object, the local variable scope is referenced twice: one reference from the lambda object itself, and another from the scope living on the stack (many lambdas created in the same scope would add more references to the count). In this situation, when the execution leaves the scope of the function, it should not de-allocate the local variables struct; the only thing it should do is decrement the reference count of the local variables by one unit, and the release should only happen when this count reaches zero. This algorithm is called Reference Counting.
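
A minimal reference-counting sketch in plain ruby, just to illustrate the algorithm (the real thing would be done in C over the scope structs; all names here are made up):

class CountedScope
  def initialize(locals)
    @locals = locals
    @refcount = 1 # the stack frame that creates the scope holds the first reference
  end

  def retain # e.g. a lambda capturing the scope
    @refcount += 1
    self
  end

  def release
    @refcount -= 1
    free if @refcount == 0
  end

  private

  def free
    # here the C implementation would free() the struct
    @locals = nil
  end
end

scope = CountedScope.new(:a => 32)
scope.retain  # a lambda captured the scope
scope.release # the method returns: count drops to 1, nothing is freed yet
scope.release # the lambda goes away: count reaches 0 and the scope is freed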

Longjmp and C++ Raii

Using the C functions setjmp/longjmp for non-local GOTOs is incompatible with C++ RAII, since RAII depends on the destructors of objects instantiated on the stack being called. For example, a longjmp crossing a scope that allocates resources will prevent the release of those resources.
Apparently, the solution to this issue is to replace the current use of setjmp/longjmp (which carries exactly this risk) with C++ exceptions, but after a spike I realized this might not be a good idea:

test.c using longjmp and setjmp
#include "setjmp.h"

jmp_buf jmp;

void function2() {
longjmp(jmp,1);
}
void function1() {
setjmp(jmp);
}
int main() {
int i;
for (i=0;i<200000000;i++) {
function1();
}
return 0;
}

It took 2.241 seconds

test.cpp using try/catch and throw
#include "exception"

void function2() {
throw std::exception();
}
void function1() {
try {
function2();
} catch (std::exception) {
}
}
int main() {
int i;
for (i=0;i<200000000;i++) {
function1();
}
return 0;
}

It never finished, even after hours

Conclusion

C++ exceptions will always be slower than setjmp/longjmp, because C++ exceptions use non-local gotos as part of their implementation plus other machinery on top. I must look for another workaround for the longjmp detail.

Links

lunes, 5 de septiembre de 2011

Callcc puzzle on fastruby

Callcc or "Call with Current Continuation" is the implementation of coroutines on ruby, inherited from another languages such scheme and lisp, callcc allows the creation of continuation objects which hold the context information (stack frame position, local variables of the entire stack, etc...) and can be called to make a non-local goto jump to that context like longjmp and setjmp functions in C (see this explanation of callcc in ruby for more info)
The difference between callcc/Continuation#call and setjmp/longjmp is that callcc allows goto to forward, backward in the stack and even to a side stack while setjmp only allows jump backward. This imply that while setjmp/longjmp works fine with linear stack, callcc will not work in that way and will need a non-linear stack, a complex tree dynamic struct instead a chunk of memory growing when needed (the native stack)
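
A tiny standalone example of callcc jumping backward, just to fix ideas (plain ruby; on 1.9 the continuation library must be required first):

require "continuation" if RUBY_VERSION >= "1.9"

def count_to(limit)
  i = 0
  start = callcc { |cont| cont } # capture the continuation right here
  i += 1
  start.call(start) if i < limit # jump back to just after the callcc
  i
end

p count_to(5) # 5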

Maybe something easier: lambdas

For example, code like this shouldn't work on the released version 0.0.7:
require "fastruby"

fastruby '
class X
  def foo
    a = 32
    lambda {|x|
      a+x
    }
  end
end
'

p X.new.foo.call(16) # will it return 48, or will it fail?

But it works! Why? The current implementation allocates the scope for ruby local variables as a C local variable of a struct type; in C, local variables live on the native stack at a fixed address, which is passed as a block parameter when calling lambda (using rb_funcall, etc.). That memory address remains associated to the block passed to lambda, and the address is then used from inside the block like in a normal block invocation. The difference here is that the scope struct used from the lambda block is stale and no longer valid, since that stack space is below the stack pointer.

To obtain a failing example (in the grand tradition of TDD ;) ), we must call any other method after calling the method that returns the lambda and before calling the lambda:

require "fastruby"

fastruby '
class X
  def foo
    a = 32
    lambda {|x|
      a+x
    }
  end

  def bar
  end
end
'

lambda_object = X.new.foo # returns the lambda object
X.new.bar # this should not affect the lambda object, since it is not related to it
lambda_object.call(16) # but it does, and this does not return 48 as expected

The call to the lambda does not return 48 as expected; in fact, it could return any unpredictable value, even an invalid object causing a segmentation fault. The reason is that the call to X#bar overwrites the stale stack space used by the local scope associated to the lambda block; in this situation, the value of the "local variable" may be any unpredictable garbage left on the stack.

Make it pass: alloc locals on the heap using malloc

And it passes, simply by moving the allocation of local variable scopes to the heap instead of the stack. Memory allocated on the heap never becomes stale... but it must be de-allocated or the ruby process will die young of cancer. It is late at night and I will not write a test asserting that ruby does not die of cancer when six billion objects are allocated, but surely the de-allocation of local variable scopes will be done as a refactoring task.

And the scope deallocation is the real key issue here

But it is a topic for another post...

Links

domingo, 28 de agosto de 2011

Fastruby v0.0.6 (PoC) released

Fastruby is a gem which allows executing ruby code faster than normal (about 20X faster than MRI 1.8); see this post for more info.

Improved ruby support

In the first released gem version of fastruby (v0.0.1), ruby support was very limited since that version was designed as a spike (e.g. it couldn't handle block calls, exceptions, break, etc...).
The ruby support improvements are based on frame structures used at runtime and some other tricks, such as non-local gotos, to implement return, next and break; this makes the generated code a bit more expensive, but it is still 20X faster (instead of the 100X of the first version).

Now (version 0.0.6) fastruby supports a wide range of ruby constructions, such as:
  • Exceptions (begin, rescue, ensure, end)
  • Classes and modules
  • Constants, globals
  • Flow control sentences: if, unless, and case..when..end
  • Literals
  • Singleton methods (almost as slow as normal ruby)
  • Ruby built-in gotos (break, return, next)
But for now, it leaves out the following:
  • Class variables (*)
  • Methods with a variable number of arguments (*)
  • Callcc (*)
(*) An issue has been created for each of these tasks

Native object cache to reduce bootstrap overhead

Usually the execution of ruby code with fastruby implies parsing the code, translating it to C, and building it by compiling with gcc.
Even small code snippets may take a few seconds, a lot of time compared with the execution of the same code using MRI, which takes hundredths of a second; imagine what would happen with larger code...
The answer to this issue is the implementation of a cache, transparent to the invoker. The first time a script is executed it takes 0.642 seconds; when the same script is executed again, it takes 0.057 seconds. The script test.rb is a very simple test:
require "fastruby"

fastruby '
print "hello world\n"
'

The cache is located in $HOME/.fastruby, and the cache feature can be deactivated by setting the environment variable FASTRUBY_NO_CACHE to 1 when executing a ruby script that uses fastruby.

Implementation details of cache

Each code snippet has an associated SHA1 sum, and each SHA1 has a collection of native libraries including both the main object and the multiple builds of the methods defined in that snippet.
The diagram is only a conceptual model and does not represent any entity in the fastruby source code; the SHA1 association is implemented on the filesystem by saving each object collection in a directory named after the hexadecimal SHA1 of the corresponding code snippet (inspired by git internals :D )
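
A conceptual sketch of the lookup, assuming the layout just described (a directory per snippet SHA1 under $HOME/.fastruby; paths and names here are illustrative):

require "digest/sha1"

def cache_dir_for(code)
  sha1 = Digest::SHA1.hexdigest(code)
  File.join(ENV["HOME"], ".fastruby", sha1)
end

snippet = 'print "hello world\n"'
dir = cache_dir_for(snippet)
if File.directory?(dir)
  # reuse the native objects built on a previous run
else
  # translate to C, compile with gcc and store the resulting objects in dir
end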




Links

Fastruby github page: https://github.com/tario/fastruby
Previous post on fastruby: http://tario-project.blogspot.com/2011/07/fastruby-v001-poc-released.html

sábado, 30 de julio de 2011

Fastruby v0.0.1 (PoC) released

Fastruby is a gem which allows executing ruby code faster than normal (about 100X faster than MRI 1.8).

Fastruby IS NOT a separate ruby interpreter, so the design is simple.

Fastruby IS NOT a DSL to generate C code using ruby, nor a Ruby-to-C translator; the goal of fastruby is to execute RUBY code.

Core Concepts

Native build


All code processed by fastruby ends up with a native representation; in the current version, this is accomplished using RubyParser to parse the ruby code.
The ruby code is translated to C and then processed with RubyInline.
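
For reference, a standalone RubyInline example, independent of fastruby, showing how C code gets compiled and exposed as a ruby method at runtime (following RubyInline's usual usage):

require "rubygems"
require "inline"

class FastMath
  inline :C do |builder|
    builder.c "
      long add(long a, long b) {
        return a + b;
      }
    "
  end
end

p FastMath.new.add(2, 3) # 5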

Transparent multimethods (and multiblocks)

The methods processed by fastruby have multiple internal implementations depending on the types of the arguments.
Each possible signature has its own version of the method, and these are built at runtime when the method is called with a new signature.
The same concept will be applied to blocks (anonymous methods) in future releases.

Type inference

Each version of a method is built for a specific signature, so the builder can assume a type for the arguments and build method calls using that assumption.
Wherever the translator can assume a type for an expression involved in a method call (used as argument or as receiver), this information can be used to encode direct calls instead of normal, expensive ruby calls.

The current implementation can only infer types for method and block arguments, and for literals.

Customization through build directives and API

To compensate for the described limitations, fastruby supports a few build directives that allow the programmer to help the inference.
The syntax of these directives is the same as a normal ruby call (see examples).
Also, fastruby will define an API to customize aspects of fastruby internals, e.g. the build method to trigger the build of a method for a specific signature (see examples).

The Image

The image of this post shows the execution of the current benchmarks of the test suite.

All of them follow the pattern of executing the code with normal ruby and with fastruby. The fourth benchmark adds a few additional measures (see the benchmark source for more info).

You can find the benchmark sources installed in the gem directory of fastruby or in the github repository, and the full resolution version of the image here.

Installation


Installing is as simple as executing the well-known gem install:

sudo gem install fastruby

Examples

The syntax is simple, since one of the main goals of fastruby is to be transparent. The current API serves for testing and customization of this PoC version of fastruby.

Prebuild of methods:
 require "fastruby"

class X
fastruby '
def foo(a,b)
a+b
end
'
end

X.build([X,String,String] , :foo)

p X.new.foo("fast", "ruby") # will use the prebuilded method
p X.new.foo(["fast"], ["ruby"]) # will build foo for X,Array,Array signature and then execute it

NOTE: it is not necessary to do this; the methods will be built automatically when they are called for the first time.

Variable "types"

Like in static languages, you can declare a type for a variable to help the inference and gain some performance. This can be done using the lvar_type directive:

class X
  fastruby '
    def foo
      lvar_type(i, Fixnum)
      i = 100
      while (i > 0)
        i = i - 1
      end
      nil
    end
  '
end


Without lvar_type, the calls to Fixnum#> and Fixnum#- would be dynamic and more expensive.

Links

domingo, 5 de junio de 2011

Shikashi v0.5.0 released

Shikashi is a sandbox for ruby that intercepts all ruby method calls executed in the interpreter, allowing or denying each call depending on the receiver object, the method name and the source file from which the call originated.

For more info about the project, visit the project page

You can install the gem by doing

gem install shikashi

New Enhancements

The primary focus of this latest release was improving the performance of ruby code executed in the sandbox while maintaining the purity of ruby (i.e. without implementing anything in C or similar).

Code packets

In the previous version of shikashi (0.4.0), to execute code in the sandbox you would do:
Shikashi::Sandbox.run("1+1", Shikashi::Privileges.allow_method(:+))
This, internally, implies:
  • Parsing the code
  • Processing the syntax tree (evalhook)
  • Emulating the transformed tree (partialruby)
  • Executing the emulation code using eval...
This was a great performance problem in cases where you need to execute the same piece of ruby code many times in the sandbox (e.g. sandboxed web code executed for each incoming request).
To solve this, the concept of a "code packet" was implemented; a "code packet" is the final product of the processing chain above, ready to be executed as many times as necessary without reprocessing the code and the tree. Example:
packet = Shikashi::Sandbox.packet(code)
# after that, you can run the "packet" as many times as you want without doing all
# parsing, tree processing and emulation stuff internally
packet.run(privileges)
packet.run(privileges)
packet.run(privileges)
packet.run(privileges)

The performance difference is on the order of 20X; you can check it by running the following benchmark:

require "rubygems"
require "shikashi"
require "benchmark"

code = "class X
def foo(n)
end
end
X.new.foo(1000)
"

s = Shikashi::Sandbox.new

Benchmark.bm(7) do |x|

  x.report("normal") {
    1000.times do
      s.run(code, Shikashi::Privileges.allow_method(:new))
    end
  }

  x.report("packet") {
    packet = s.packet(code, Shikashi::Privileges.allow_method(:new))
    1000.times do
      packet.run
    end
  }

end

In my laptop, the result of this benchmark is:

            user     system      total        real
normal 5.870000 0.540000 6.410000 ( 6.436331)
packet 0.290000 0.020000 0.310000 ( 0.315351)

The repeated execution of code using packets is 20 times faster than normal

Refactor of evalhook internals

Evalhook is the support gem that provides the ability to hook events during the execution of ruby code. The evalhook implementation was not initially designed for performance, so it was possible to refactor it to increase performance by suppressing unnecessary calls, among other optimizations.

require "rubygems"
require "shikashi"
require "benchmark"

s = Shikashi::Sandbox.new

class NilClass
  def foo
  end
end

Benchmark.bm(7) do |x|

  x.report {
    code = "
      500000.times {
        nil.foo
      }
    "
    s.run code, Shikashi::Privileges.allow_method(:times).allow_method(:foo)
  }

end



The result without the optimizations:
            user     system      total        real
36.140000 1.420000 37.560000 ( 37.672708)


The result with the optimizations:
            user     system      total        real
27.460000 1.130000 28.590000 ( 28.658031)


The difference is not so significant because most of the time is spent checking privileges in both cases. Similar benchmarks executed directly against evalhook show greater differences.

Links

sábado, 28 de mayo de 2011

Released ImageRuby-Devil - a bridge between ImageRuby and Devil image library

ImageRuby-devil is the bridge between ImageRuby and the Devil image library; it adds a group of new features to ImageRuby based on devil features:
  • Loading and saving the image formats supported by the devil library, including jpg, tga, png, etc...
  • New image operations, including "alienify", "blur" and "contrast"
The development, release and usage of this gem was anticipated in the first ImageRuby release announcement.

Install the gem


In the command line, execute:
gem install imageruby-devil

Example

# it is not necessary to require imageruby-devil explicitly
require "imageruby"

image = ImageRuby::Image.from_file("input.jpg") # now, jpg files can be loaded

image.color_replace!(Color.black, Color.purple) # use a method of ImageRuby
image.blur(1) # use a method added by imageruby-devil

image.save("output.png", :png) # saving (and loading) png format is now supported

Links:

ImageRuby-devil at github: https://github.com/tario/imageruby-devil
ImageRuby at github: https://github.com/tario/imageruby
ImageRuby-bmp-c: https://github.com/tario/imageruby-bmp-c
Devil image library official site: http://openil.sourceforge.net/

viernes, 29 de abril de 2011

Released Shikashi v0.4.0 (and dependencies)

Shikashi is a sandbox for ruby that intercepts all ruby method calls executed in the interpreter, allowing or denying each call depending on the receiver object, the method name and the source file from which the call originated.

For more info about the project, visit the project page

You can install the gem by doing


gem install shikashi


New Enhancements

removed evalmimic dependency

evalmimic is a gem that emulates the behavior of the binding argument of the eval method (the default binding), to allow the API to do this:

a = 5
Sandbox.run("a+1", Privileges.allow_method(:+)) # return 6


The implementation of evalmimic is somewhat complex because it relies on a C extension and even implements some low-level hacks using unsupported features of MRI (e.g. evalmimic will not compile and install for ruby 1.9).

So it was decided to remove the evalmimic dependency from shikashi and evalhook, and to remove the feature shown in the example above. The only difference now is that you must pass the binding as a parameter if you want to execute the sandbox in that way.

a = 5
Sandbox.run("a+1", Privileges.allow_method(:+), binding) # return 6


And if you do not specify the binding, the default behavior is to use the global binding nested in the sandbox namespace.


Sugar Syntax

As seen in the previous code examples, it's no longer necessary to instantiate a lot of objects in order to execute code in the sandbox; Sandbox.run and Privileges now support a method-chaining syntax. Example:

require "shikashi"

Sandbox.run('print "hello world\n"', Privileges.allow_method(:print))

$a = 1
Sandbox.run('print $a, "\n"',
Privileges.allow_method(:print).allow_global_read(:$a)
)



Control over read access of constants and global variables

Now you must grant read privileges over global variables and constants in order to allow read access to them. By default, trying to access global variables and constants will result in SecurityError exceptions. Constants defined inside the base namespace of the sandbox are allowed by default (e.g. classes defined in the same code).

# this will work
include Shikashi
Sandbox.run("
class X
  def foo
  end
end
X.new.foo
", Privileges.allow_method(:new))

$a = 4
Sandbox.run("$a", Privileges.allow_global_read(:$a)) # 4

A = 4
Sandbox.run("A", Privileges.allow_const_read("A")) # 4

Sandbox.run("$a") # raises SecurityError

Sandbox.run("A") # raises SecurityError




Interception of method calls using super on evalhook


Now, calls to super methods are intercepted by evalhook and rejected by shikashi when appropriate.

include Shikashi
Sandbox.run("
class X
  def system(*args)
    super # raises SecurityError
  end
end
X.new.system('ls -l')
", Privileges.allow_method(:new))


Refactor to use Ruby2Ruby on partialruby

Partialruby is the gem that emulates ruby using ruby to allow the AST changes needed by evalhook for interception. Up to version 0.1.0, partialruby implemented the emulation with abstract-tree processing written from scratch. Now, in the released version 0.2.0, partialruby relies on the more mature and stable gem Ruby2Ruby, which converts an AST into executable ruby source code.

Links


martes, 19 de abril de 2011

Released ImageRuby - a flexible ruby gem for image processing

ImageRuby is a flexible gem for image processing, designed to be easy to install and use. The core of ImageRuby is written in pure ruby and the installation is as simple as executing "gem install imageruby". The API of the library takes advantage of sugar-syntax constructions specific to the ruby language, such as method chaining and blocks. E.g., the next code loads an image from "input.bmp", crops the rectangle from 0,0 to 49,49 (50x50 pixels), replaces orange with red, replaces black with orange, and finally saves the image as "output.bmp":

require "imageruby"

ImageRuby::Image.from_file("input.bmp")[0..49,0..49].
color_replace(Color.orange,Color.red).
color_replace(Color.black,Color.orange).
save("output.bmp", :bmp)

Current Status

ImageRuby had its first release (version 0.1.0) last week, together with imageruby-c (an extension that overrides a few methods with C implementations) and imageruby-bmp (support for saving and loading images in BMP format).

You can install the gem by doing

gem install imageruby


The current version of imageruby has no dependencies, and its installation should be as simple as that; optionally, you can install imageruby-c to improve the performance of the draw and color_replace methods, and imageruby-bmp to get support for bmp images.
The extensions take effect automagically just by installing the gem; there is no need to do anything specific in the ruby script (e.g. when imageruby-c is installed, the methods are optimized automatically).

Goals of the project

  • Become synonymous with image processing in the ruby world, displacing other options in the "market" but re-using the existing stuff (giving credit, obviously)
  • Highlight deficiencies of the ruby language when used for heavy processing (e.g. the draw method implemented in ruby is very slow)
  • Spike workarounds for those ruby language issues
  • Spike a "soft" dependency model of plugin gems to avoid the problems of "static" dependencies

Competitors

There is a range of gems for image processing in ruby; most of them are based on C libraries and imply the need for a compiler in order to install them. Some have "low level" bugs such as memory leaks, but most of them offer many possibilities through their API.

  • RMagick: Ruby wrapper of the archi-famous ImageMagick library. It has reported problems with low-level memory handling and is not recommended for server environments (such as a Rails application), but it is good for scripting
  • ImageScience: A more stable alternative to RMagick, written on top of FreeImage and RubyInline (for more optimal code in C); an issue when using this library is the RubyInline dependency, which implies the need for GCC installed on the system, something that is simple on Linux but trickier in other environments such as Windows and Heroku
  • Camellia: A C/C++ library with a Ruby interface, apparently aimed at photo processing; there is no gem and the installation has to follow the classic command sequence ./configure; make; make install.
  • Devil: Wrapper of the C library of the same name. It has a wide range of operations over an image, including blur and equalize. (see documentation)
  • Ruby-Processing: Well documented library mainly oriented to interactive media, including video features. Has nice pencil functions
Future

New image formats

One of the next big steps will be developing decoders and encoders for common image formats; the encoders and decoders can be released as separate gems (like imageruby-bmp) without any need to change the imageruby core. In fact, anyone can write their own encoder or decoder.

Competitors become allies

Re-use the existing development around image processing in the ruby world, divided into two big groups: Interfaces and Implementations.

Reusing interfaces implies the creation of "mocks" or "wrappers" of ImageRuby providing the same API as another library (e.g. RMagick or ImageScience), to reduce the learning curve of the ImageRuby API and provide a way to replace that library with ImageRuby transparently for code developed on top of it (e.g. a rails application using ImageMagick where the ImageMagick gem is later replaced by ImageRuby will not need to be modified).

Reusing implementations is about building features on top of other image libraries, but without setting a "hard" or "static" dependency on the library; instead, an optional extension for ImageRuby is created (something like a port between libraries). Example: imageruby-devil.

It could be installed like this:
gem install imageruby-devil

And used like this:
# it is not necessary to require imageruby-devil explicitly
require "imageruby"

image = ImageRuby::Image.from_file("input.bmp")

image.color_replace!(Color.black, Color.purple) # use a method of ImageRuby
image.devil.blur(1) # use a method of Devil library on the image

image.save("output.bmp", :bmp)

Feedback for Ruby improvements

One of the goals of ImageRuby is to highlight deficiencies of the ruby language: methods such as draw or color_replace process hundreds of thousands of pixels per image, which in ruby implies a huge number of unnecessary dynamic ruby calls just to access each bit of pixel data of the processed images. This produces a very, very slow result; the current workaround is to override the "heavy" methods with the imageruby-c extension. This was done as an optional extension to avoid the unnecessary hard dependency of having a compiler on the system: ImageRuby without imageruby-c installed still HAS the draw and color_replace methods, but with the imageruby-c gem installed they are much, much faster (near 100x faster).

Links

ImageRuby rdoc reference: http://tario.github.com/imageruby/doc/
ImageRuby GitHub site: https://github.com/tario/imageruby

lunes, 4 de abril de 2011

Evalhook v0.3.0 released and shikashi works on Heroku

Evalhook is a vital dependency of the shikashi gem; it provides the ability to execute managed ruby code while allowing "hooks" on common events like method calls, system calls, assignment of global variables and constants, etc...

Fixed Issues

This release solves the compatibility problems with the current version of shikashi (v0.3.1 at the moment of writing this post). The most important change was a refactor of the hooking to make it more portable, less error-prone and implemented in pure ruby (the C extension was removed from the gem); this implies that shikashi now works on heroku and possibly in other environments.

Update/Install

To update the current installation of the evalhook gem, run in a terminal:

gem install evalhook

NOTE: To update the gem on heroku you must change the Gemfile and Gemfile.lock files and then push these files to your application git repository, setting the content of the Gemfile to something like this:

source :rubygems

gem 'shikashi'
gem 'evalhook', '0.3.0'
gem 'partialruby', '0.1.0'


Fundamentals

The most important concept behind evalhook is the AST hook technique, which consists in changing the Abstract Syntax Tree before it is interpreted and executed (see this other post for more info).

In previous versions of evalhook, the way to implement the AST hook was to change the internal structures (AST representations) of version 1.8 of the Matz Ruby Interpreter. That implementation breaks the encapsulation of MRI and depends on the specific implementation of that ruby interpreter, giving very poor portability and being error-prone (e.g. the ruby interpreter crashes on heroku).

The reimplementation of the evalhook core is based on partialruby, a "para-interpreter" which executes ruby code by emulating it using ruby. Partialruby also exposes the hooking services needed by evalhook, thus avoiding the need for "hacks" to implement that service in an interpreter that does not provide it. This concept is much more portable and should, in theory, work with any interpreter of ruby 1.8 code.

Links

Shikashi Project description: http://tario-project.blogspot.com/2011/03/shikashi-flexible-sandbox-for-ruby.html
AST Hook fundamentals: http://tario-project.blogspot.com/2011/03/how-tree-hooks-works-and-what-are.html
Evalhook on Github https://github.com/tario/evalhook
Shikashi on Github: https://github.com/tario/shikashi
Heroku: http://heroku.com

sábado, 19 de marzo de 2011

How AST hook works and what are the current implementations

AST (Abstract Syntax Tree) hooking is a technique to control the behavior of certain ruby node elements like method calls, global variables, etc...

First of all, certain ruby interpreters (including MRI) have an internal representation of the AST called the "node tree"; each piece of code read by the interpreter for execution is parsed and represented using a node tree structure. Then this structure is read at runtime and "executed" by the interpreter core/vm.

So, the tree hook technique implies the modification of that tree, before it is read and executed by the interpreter, in order to install certain "hooks". This is done by patching the tree and inserting new node elements.

Roughly, in the normal execution flow of the interpreter, the code is parsed and translated into the AST and finally executed by the vm, which is part of the interpreter core:




When the AST hook operates, it changes the node tree after it is built and before it is executed:



For example, the call node can be intercepted by rewriting it from the original node layout into a patched one (both layouts were shown as diagrams in the original post). The patched layout is a valid node tree structure too; it emulates the call after notifying the event to a handler, which decides what to do.

Current Implementations

MRI (Matz Ruby Interpreter) Hack

The first implementation of tree patching was a hack of MRI which applies the patching directly to the node tree structure located in the memory of the ruby interpreter process; certain node pointers can be obtained in a C extension using a trick like this:



VALUE hook_block(VALUE self, VALUE handler) {
  process_node(ruby_frame->node->nd_recv, handler);
}

And then the node tree can be walked to apply the patching. For example, patching a call node:



void patch_call_node(NODE* node, VALUE handler) {
  NODE* args1 = NEW_LIST(NEW_LIT(ID2SYM(node->nd_mid)));
  NODE* args2 = NEW_LIST(NEW_LIT(handler));

  node->nd_recv = NEW_CALL(node->nd_recv, method_hooked_method, args1);
  node->nd_recv = NEW_CALL(node->nd_recv, method_set_hook_handler, args2);

  node->nd_mid = method_call;
}

Advantages
  • Easy to implement in C using interpreter code
Disadvantages
  • MUST be implemented in C to access the interpreter internal structures
  • Poor compatibility: the implementation relies on internal structures of a particular ruby interpreter version (e.g. it won't work on ruby 1.9)

https://github.com/tario/evalhook/blob/v0.2.0/ext/evalhook_base/evalhook_base.c


Partial Ruby

In the previously detailed implementation, the main problem of the hack was compatibility, because the tree patching is performed by changing internal structures of the interpreter which may or may not exist in a given interpreter. It was done that way because the interpreter does not expose in its API any service to modify the tree (in fact, there may not be any node tree at all in many interpreters).
The solution is to create another interpreter where the needed services are exposed, using resources that already exist in the environment: the parser, the API and the VM.
Basically, PartialRuby parses the input ruby source into an AST represented with ruby structures; after that, it executes the ruby AST by emulating it using ruby and, finally, passes the emulation ruby code to the real interpreter.
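
A rough sketch of that parse / transform / regenerate pipeline, using the RubyParser and Ruby2Ruby gems mentioned in other posts here (the transformation step is just a placeholder, not the actual partialruby code):

require "rubygems"
require "ruby_parser"
require "ruby2ruby"

ast = RubyParser.new.parse("1 + 1")
# ... a hook implementation would rewrite the call nodes of `ast` here ...
code = Ruby2Ruby.new.process(ast)
p eval(code) # 2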






In this scenario, partialruby exposes in its API the services needed to perform the node tree patching.

Advantages
  • Can be implemented in pure ruby
  • No access to ruby interpreter internals needed: compatibility guaranteed
Disadvantages
  • Must re-implement a part of the Ruby VM

https://github.com/tario/partialruby