Tuesday 31 March 2015

Switching projects

Project Switch
Project Progress: PERL5 PCRE / Stack-less Just in time compiler
This is an update about my project progress in SPO600, I have switched projects and made some steps forward.

Perl5

In my previous post I ruled out some areas in Perl5 and was heading towards the regular expression engine as an area to optimize. Well I got a response from the mailing list saying the regular expression engine should be fine and does not really need any simple optimizations. I was kind of stubborn in thinking I could find something in perl and I kept looking when nothing obvious was there. In hindsight I should have stopped and tried out different packages sooner.

PCRE and sljit

After a suggestion from my professor I have looked at PCRE(Perl compatible regular expressions) This small library deals with parsing regular expressions using similar semantics as perl. I downloaded the source via subversion
svn co svn://vcs.exim.org/pcre2/code/trunk pcre
and found a folder in the source called sljit. There are many files in that directory with architecture specific code. They have some for x86, some for ARM, some for Sparc and more, this peaked my interest so I looked into it. I found that sljit is a stack-less just in time compiler which is cpu independent, more info here. Specifically in pcre it is part of a pcre performance project which uses sljit to improve the pattern matching speed of pcre.

Moving forward

I have just found all this recently, so for now I will be looking into any functions that can be ported over to another architecture or any areas that look promising for optimization. I hope to soon zero in on one particular location, and actually get something done.
Thanks for reading

Tuesday 24 March 2015

Project updates

Project Progress
Project Progress: PERL5
This is an update about my project progress in SPO600

Ruled out

In my previous post I mentioned three areas, The tail call optimization, Regex super - linear cache and inline assembly. After contacting the perl5 community on irc it seems that the tail call optimization portion should not be there, therefore it has been ruled out. I mailed the mailing list about any suggestions regarding the inline assembly portion and any other ideas they had but I have not received word back as of now. I kind of ruled out this assembly code for now unless further information comes up to suggest it has potential.

Moving on

In proceeding with this project I have not narrowed down or progressed as much as I would have liked to by this time, I am just spinning my wheels. For now I will work at understanding how I could improve the regex engine while also exploring other packages to see if I can find some more straightforward things to accomplish. Ideally I would like to find places I could use inline assembly optimizations or compiler intrinsics, both of which we have been looking at recently in the SPO600 course.

Sunday 15 March 2015

Project Progress

Project Progress
Project Progress: PERL5
This post is about a project I am starting in my SPO600 class that requires me to optimize a portion of the lamp stack.

Areas to optimize

I have chosen Perl5 as the package that I will be working on and have found a few areas that may be good areas to optimize and make changes.
The first two areas I found using the perl todo list.
The first was tail call optimization:
Seen at line 1130.
This would essentially have me find areas where tail call optimization is possible and rewrite them to implement it. Here is a link that explains what tail call optimization is: TCO

The second area I found was in regards to Perls regular expression engine.
Seen at line 1154
In their engine certain regular expressions end up taking exponential time. They have a workaround for this called super-linear cache but they say the code has not been well maintained and could use improvement. I found the location of this problem in the source by grepping for the keyword 'super-linear', found at regexec.c.
It seems like this could be an area for optimization although I am not very confident about how I would attempt this or what I would change to improve it because I do not have a strong knowledge of how a regular expression engine works.

The final one I found by looking through the perl 5 git repository(Instructions here), I found some sections of inline assembly by grepping for the keyword 'asm' using 'grep -r asm ./*' these sections were in a file called os2.c:
my_emx_init()
my_os_version
These functions may potentially be able to be ported to aarch64 syntax. I am a bit uncertain about this area because I am not sure what this code does or if it is important or not.

Why Perl?

I chose Perl for my project because the community seems really clear and organized. They have a todo list with various tasks which is very useful and as you can see above it helped me a lot with regards to finding areas to work on. Also they have a very active community, In their mailing list archive, there are daily messages which makes me confident that if I need help or need to ask a question I won't be waiting for extended periods of time.

Proceeding

Looking at my 3 options I believe the tail call optimizations might have a large impact depending on how many areas I can find. I would like to implement some code involving the aarch64 platform because that would relate to the SPO600 course the most but I am uncertain about the inline assembly code that I have found so far. The regular expression area seems really interesting, but I am afraid it would not be feasible given the time I have, it is something I will definitely consider if my project doesn't go as planned.
Proceeding with this project I plan on starting to work out how to apply the tail call optimizations while I engage with the upstream community about which direction is the best for them and for me. I also plan on benchmarking Perl on x86 and aarch64 to see if I can find any further areas or functions that may let me perform a platform specific optimization.

Perl Upstream

Perl has a relatively straightforward guide on their website here.
To summarize, if you have a patch either use perlbug or send it to perlbug@perl.org. Once the patch has been processed it will be posted on the mailing list for discussion, you are encouraged to join the discussion and promote your patch. They recommend using git, You can get the source by using 'git clone git://perl5.git.perl.org/perl.git perl', Once you make changes you can use git diff to make a patch, this compares your branch and the main branch to produce the patch.

Conclusions

This project has made me the most nervous of any project I have had so far. It is filled with uncertainties, a couple weeks ago I was uncertain I would even find anything to work on but eventually I did. Now I am uncertain on which direction to go and whether or not my contributions will be accepted. Regardless of what happens it is a great learning experience and I now appreciate the complexity of large projects like Perl or other packages in the lamp stack.

Monday 2 March 2015

Device Access

Device Access
Device Access
This post is regarding a presentation that I did in SPO600 where I talked about device access in low level languages.

Access using I/O Ports

I/O ports: These look just a memory cell to a computer but they connect any data written or read from it to a device that is connected to the computer.
  MOV DX, 0487 ; Port number.
  MOV AL, 1    ; Number to write to the port.
  OUT DX, AL   ; Write to the port.
  IN DX, AL    ; Read from the port.
In this case DX takes in the port number and AL takes in the number you are writing to the port. OUT will write the number stored in AL to DX and IN will read from the device and store results in AL.
Danger:
Accessing devices in this way is dangerous. Every port cannot be accessed in the same way, Some ports are read only, and some are write only and some are both. If you don’t know exactly what port you are using you risk damaging any device that is accessing that port.

Access using Interrupts

An alternative method is using software interrupts to gain access to a device, This is often used in more complex devices such as the mouse in order to simplify accessing it.
  MOV AX, 0 ; Access subfunction 0 of int 33h 
  INT 33    ; Make the interrupt call
In the Example we use Interrupt 33 which provides access the mouse, and moving a number into AX will allow you to access a specific function of the mouse (function 0 returns a value which indicates if a mouse has been detected/installed). There are many, many more sub-functions available here that allow you to access many different parts of the mouse.

Platform / architecture Issues

I/O and device instructions are very processor dependent, due to the details of how a processor moves data in and out, most of the source code is coded specifically for a platform. For example in the linux kernel they have many different files including io.h that are stored in folders specific to the architecture.
Arm: io.h
x86: io.h

Conclusions

I was unable to find to much information about this topic and Chris Tyler, my professor for SPO600, pointed out to me that this is because all the device access SHOULD be handled by the operating system unless you are writing for device drivers or doing embedded programming.

Resources

  • 1 A portion of a book explaining IO ports and interrupts
  • 2 A tutorial explaining IO ports and software interrupts
  • 3 One chapter of a book about device driver programming detailing IO ports and how they are used with devices
Thanks for reading