November 2009 Archives

CHI looks interesting - one Perl module interfacing to any kind of cache. The "L1 cache" + "mirror cache" is a technique worth looking at and it adds more value to the module beyond just a unified interface.

Systems that Never Stop


Thanks to Yuval Kogman for pointing out this great talk - Systems that Never Stop (and Erlang).

I wrote recently about feedback.cgi on the Bratislava Perl Mongers page. It turned out to be useful only for delivering more spam. Today I've replaced it with a JavaScript bookmarklet for the new evil thing from Google - Sidewiki. So everyone still has a chance to annotate the pages and I have one less thing to maintain. Win-Win.


Btw this is how one can get the Sidewiki RSS feed URL for the http://bratislava.pm.org/ page:

perl -MURI::Escape -le 'print "http://www.google.com/sidewiki/feeds/entries/webpage/", uri_escape($ARGV[0]), "/default?includeLessUseful=true"' http://bratislava.pm.org/

We got new smoking regulations:


Please kindly note that, due to a decision by consensus - which was made
at the residents meeting yesterday- from now on *SMOKING is STRICTLY
FORBIDDEN* in the whole building (it's also not allowed to smoke on the
balcony anymore).

*SMOKING is ONLY allowed in front of the Spar Supermarket (where the ash
tray is located)!!!*

Thanks for your cooperation!


This reminded me of this part from The IT Crowd:

There is no Chrome "OS"


Yesterday I was watching "Google Chrome OS Open Source Project Announcement":


The "Chrome OS" got demystified, well at least for me. It's nothing more, but also nothing less, than a project to throw away some conventions of current systems. Or, as some people say, a return to the thin client era. Basically "Chrome OS" is (will be?) a damn fast init loader. An init that will load just the things needed to run a browser. Then it is up to the browser to fulfil the consumer's needs.

The challenging part will be to add all the HTML5 and other features needed to give the browser all the power of classical desktop applications. Like access to HW acceleration, offline storage, popups, etc. Another challenging part will be to convince people to write real web apps using these features.

There are 2625 XML modules atm on CPAN. (`zcat 02packages.details.txt.gz | grep -E '\bXML\b' | wc -l`) So why not create another one? There are a number of modules that try to map XML structure to Perl data structures. This can never be perfect and sometimes needs custom hooks to adapt. What about the other, easy (easier) way around? Serializing Perl data structures to XML? I've made an experiment - Data::asXML. And here is what it looks like:

"Some text"
<VALUE>Some text</VALUE>


[ 1, 22, 333, 4444 ]
<ARRAY>
	<VALUE>1</VALUE>
	<VALUE>22</VALUE>
	<VALUE>333</VALUE>
	<VALUE>4444</VALUE>
</ARRAY>

{ heh => 'and', so => [ 'what', '?' ] }
<HASH>
	<KEY name="heh"><VALUE>and</VALUE></KEY>
	<KEY name='so'>
		<ARRAY>
			<VALUE>what</VALUE>
			<VALUE>?</VALUE>
		</ARRAY>
	</KEY>
</HASH>

Quite verbose, isn't it? But it should be clear how the data gets serialized to and from XML. My main reason was to have an easy and clear way to pass data to XML::LibXSLT. That meant building a DOM tree with XML::LibXML. As you can see, the element names are static - HASH, ARRAY, VALUE. All that changes are the attributes. The resulting DOM structure can be easily matched via XPath, which is important for XSLT. Here are some examples:

/HASH/KEY[@name="key"]/VALUE
/HASH/KEY[@name="key2"]/ARRAY/*[3]/VALUE
/ARRAY/*[1]/VALUE
/ARRAY/*[2]/HASH/KEY[@name="key3"]/VALUE
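
To make the mapping concrete, here is a minimal sketch of the same idea in plain Perl. It is not how Data::asXML is implemented (that builds an XML::LibXML DOM); the `to_xml` and `xml_escape` names are made up for this illustration, it only handles plain scalars, array refs and hash refs, and it sorts hash keys just to get predictable output.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text and attribute values.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Recursively turn scalars, array refs and hash refs into XML text
# using only the three static element names HASH, ARRAY and VALUE.
sub to_xml {
    my ($data) = @_;

    if (ref $data eq 'ARRAY') {
        return '<ARRAY>' . join('', map { to_xml($_) } @$data) . '</ARRAY>';
    }
    if (ref $data eq 'HASH') {
        my $xml = '<HASH>';
        for my $key (sort keys %$data) {
            $xml .= '<KEY name="' . xml_escape($key) . '">'
                  . to_xml($data->{$key})
                  . '</KEY>';
        }
        return $xml . '</HASH>';
    }
    die 'unsupported reference type: ' . ref($data) if ref $data;

    # undef is written as an empty VALUE here (a simplification of this sketch)
    return '<VALUE>' . xml_escape($data // '') . '</VALUE>';
}

print to_xml({ heh => 'and', so => [ 'what', '?' ] }), "\n";
```

Because the element names never change, an XPath such as /HASH/KEY[@name="so"]/ARRAY/*[1] works against this output in the same way as in the examples above.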

I got a little bit further than just creating a DOM, as Data::asXML can now deal with double references and circular references. Just have a look at how this data is encoded:

my (%hash1, %hash2, %hash3, $scalar_ref3);
%hash1 = (
	'info' => '/me hash1',
	'next' => \%hash2,
	'prev' => \%hash3,
	'more' => \$scalar_ref3,
);
%hash2 = (
	'info' => '/me hash2',
	'next' => \%hash3,
	'prev' => \%hash1,
	'more' => \$scalar_ref3,
);
%hash3 = (
	'info' => '/me hash3',
	'next' => \%hash1,
	'prev' => \%hash2,
	'more' => \$scalar_ref3,
);

print Data::asXML->new->encode([\%hash1, \%hash2, \%hash3])->toString, "\n";
<ARRAY>
	<HASH>
		<KEY name="info">
			<VALUE>/me hash1</VALUE>
		</KEY>
		<KEY name="next">
			<HASH>
				<KEY name="info">
					<VALUE>/me hash2</VALUE>
				</KEY>
				<KEY name="next">
					<HASH>
						<KEY name="info">
							<VALUE>/me hash3</VALUE>
						</KEY>
						<KEY name="next">
							<HASH href="../../../../../../*[1]"/>
						</KEY>
						<KEY name="prev">
							<HASH href="../../../../*[1]"/>
						</KEY>
						<KEY name="more">
							<VALUE type="undef" subtype="ref"/>
						</KEY>
					</HASH>
				</KEY>
				<KEY name="prev">
					<HASH href="../../../../*[1]"/>
				</KEY>
				<KEY name="more">
					<VALUE href="../*[2]/*[1]/*[4]/*[1]"/>
				</KEY>
			</HASH>
		</KEY>
		<KEY name="prev">
			<HASH href="../*[2]/*[1]/*[2]/*[1]"/>
		</KEY>
		<KEY name="more">
			<VALUE href="../*[2]/*[1]/*[2]/*[1]/*[4]/*[1]"/>
		</KEY>
	</HASH>
	<HASH href="*[1]/*[2]/*[1]"/>
	<HASH href="*[1]/*[2]/*[1]/*[2]/*[1]"/>
</ARRAY>

The href attribute holds a relative XPath to the element that should be referenced. Whether anyone will ever find it useful, I don't know, as so far I made it just to see if it's possible and to have some fun with XML and Perl data structures...
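
The first half of that job, noticing that a reference was already serialized, can be sketched in a few lines of Perl. This is only an illustration and not the Data::asXML code: `find_repeats` is a made-up name, and it reports Perl-style paths from the root instead of computing relative XPath expressions.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Walk a structure and collect every hash/array reference that was
# already visited, together with the path where it was first seen.
sub find_repeats {
    my ($data, $path, $seen, $repeats) = @_;
    $path    //= '$data';
    $seen    //= {};
    $repeats //= [];

    return $repeats unless ref $data eq 'HASH' or ref $data eq 'ARRAY';

    my $addr = refaddr($data);
    if (exists $seen->{$addr}) {
        push @$repeats, "$path -> $seen->{$addr}";
        return $repeats;    # do not descend again, this is what breaks a cycle
    }
    $seen->{$addr} = $path;

    if (ref $data eq 'ARRAY') {
        for my $i (0 .. $#$data) {
            find_repeats($data->[$i], $path . "[$i]", $seen, $repeats);
        }
    }
    else {
        for my $key (sort keys %$data) {
            find_repeats($data->{$key}, $path . "{$key}", $seen, $repeats);
        }
    }
    return $repeats;
}

# two hashes pointing at each other
my (%first, %second);
%first  = (other => \%second);
%second = (other => \%first);

print "$_\n" for @{ find_repeats([ \%first, \%second ]) };
```

Each reported pair is a place where a serializer would emit an element with an href instead of descending again.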

Does anyone know of a Perl MVC implementation that is based on some event loop? Having non-blocking IO (filesystem/database) in web request handling would be nice. That would make it possible to run just one process per CPU. The IO is the only reason why "one process is NOT enough for every CPU".

perl -le '$s=3000; $t+=($s*=1.032)*12 for 30..65; print int($s),"€"; print int($t),"€";'
9323€
2447289€

2.4mio€ => a more than fair price (starting from a 3000€ monthly salary) for one life of a person who is currently 30 years old and lives in Austria, where the inflation rate was 3.2%.

s/3000/$dream_salary/
s/1.032/$inflation_in_your_currency/
s/30/$your_age/

To get the price for your whole life...
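
For anyone who prefers parameters to substitutions, the same arithmetic can be written as a small script. This is just the one-liner spelled out; the `life_price` name and the retirement age argument (65 in the one-liner) are labels chosen for this sketch.

```perl
use strict;
use warnings;

# Returns (final monthly salary, lifetime total), both truncated to integers.
# The yearly raise is applied before the year is counted, as in the one-liner.
sub life_price {
    my ($monthly, $inflation, $age, $retirement_age) = @_;
    my $total = 0;
    for my $year ($age .. $retirement_age) {
        $monthly *= $inflation;
        $total   += $monthly * 12;
    }
    return (int($monthly), int($total));
}

my ($salary, $total) = life_price(3000, 1.032, 30, 65);
print $salary, "€\n", $total, "€\n";
```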

Stupid heh? But fun!

Or maybe not. Just compare the total sum next time you read that the government wasted 100mio€ and you will know how many man-lives they threw away.

Beyond the key-value model


Redis - A persistent key-value database with built-in net interface written in ANSI-C for Posix systems. ... see all the commands supported by Redis to get the first feeling

Looks interesting.

(read: it must be enough in most cases)

Why? Because most of the cases = one CPU, one disk and one RAM. Running more Apache children on such a machine is just a waste of resources. The CPU or disk cannot work at more than 100%. Only sw developers can do better than 100% because they have to! ;-)

The "normal" approach is to fork as many mod_perl Apache children as the RAM allows, to be able to handle as many requests as possible. This was due to the fact that Apache was processing the whole request from the client. This includes the connect, receive request, process request and send response steps. If we could strip this down to the "process" step, then we would need just one process to keep all the resources of the hw busy.

The way to outsource the client communication somewhere else is to have a lightweight reverse proxy. ba.pm.org uses nginx to deliver all the static content and to proxy the only dynamic part to Apache, which is the processing of the feedback form submit.

What nginx does is that it receives the whole client request (which can be big and long lasting in the case of file uploads) and only when it is complete passes it to Apache. Then, when Apache is finished, it passes the whole response to nginx and is ready to process another request, while nginx (slowly) lets the client download the response.

Nginx is built to be lightweight and fast at either delivering static content or proxying the dynamic content.

Let's have a look at the nginx config of ba.pm.org:

# bratislava.pm.org
server {
    listen       80;
    server_name  ba.pm.org;

    access_log /var/log/nginx/bratislava.pm-access.log;
	
    location / {
        rewrite  ^(.*)$  http://bratislava.pm.org$1  permanent;
    }		
}
server {
    listen       80;
    server_name  bratislava.pm.org;

    access_log /var/log/nginx/bratislava.pm-access.log;
	
    location / {
        root   /data/www/pm;
        index  index.html index.htm;
    }
	
    location /cgi/cgi-bin/ {
        proxy_pass        http://internal-hostname:81;
        proxy_set_header  Host              $http_host;
        proxy_set_header  X-Forwarded-Host  $http_host;
        proxy_set_header  X-Real-IP         $remote_addr;
        proxy_set_header  X-Forwarded-For   $proxy_add_x_forwarded_for;
    }

    location ~* \.(jpg|jpeg|gif|css|png|js|ico|rdf)$ {
        root   /data/www/pm;
        expires           1h;
    }
}

The site is ready to handle 1024 simultaneous connections and to receive a lot of feedback. ;-)

So? If you are not "everyone" and your Perl code is doing other things than local IO, one Apache child is not enough. But nginx helps to decrease the number of children for a low price.

7. XSLT hammer


There are a couple of transformations done using XSLT for the ba.pm.org page. From one XML file, XSLT generates the RSS feed (.rdf) and the events page (.tt2). From another XML file, it generates the who is who page and the JavaScript for Google maps. Thanks go to potyl for having the patience and time to do these!

From Makefile:

html/en/news.rdf: etc/events.xml tt/dtd/events-1.0.dtd xslt/events-to-rdf.xslt
	xmllint --valid --noout $<
	xsltproc --stringparam lang en --nodtdattr xslt/events-to-rdf.xslt $< > $@

tt/events.tt2-en: etc/events.xml xslt/events-to-html.xslt tt/dtd/events-1.0.dtd
	xmllint --valid --noout $<
	xsltproc --stringparam lang en --nodtdattr xslt/events-to-html.xslt $< > $@

tt/who.tt2: etc/who.xml xslt/who-to-html.xslt
	xmllint --valid --noout $<
	xsltproc --nodtdattr xslt/who-to-html.xslt $< > $@

src/js/06_mongers.js: etc/who.xml xslt/who-to-js.xslt
	xmllint --valid --noout $<
	xsltproc --nodtdattr xslt/who-to-js.xslt $< > $@

So what is so good about XSLT? It is made to do XML transformations. From XML to anything, as you've seen in the above examples. Here is a collection of links pointing to XSLT related pages:

An XSLT stylesheet can include other stylesheets with matchers, can override the ones defined earlier and can do some other nasty hacks. You can do pretty much anything, although it may not be the best idea. Like in Perl...

6. feed us back


First of all I have to say the feedback form was a big mistake. Or a waste of time? Whatever. It has been on the ba.pm.org page for more than a year now and no one ever used it, besides the two times when I gave a presentation about ba.pm.org at TCPW 2008 and YAPC::EU::2009. Actually it was used, but by spammers :-/ I wish all the spammers would die a long and painful death, sorry but ... but they deserve it!

Back to the feedback implementation. It consists of four parts - HTML forms, a CGI to process the POST, two response pages and JavaScript for JS enabled browsers.

HTML markup is generated from feedback.yaml and feedback-static.yaml using HTML::FormFu in generate-form.pl, producing feedback.tt2 and feedback-static.tt2 to be included in the page templates:

# from Makefile
tt-lib/forms/feedback.tt2: etc/forms/feedback.yaml
	script/generate-form.pl --in $< --out $@

tt-lib/forms/feedback-static.tt2: etc/forms/feedback-static.yaml
	script/generate-form.pl --in $< --out $@

feedback-static.tt2 has element id-s and is used for the contact page. feedback.tt2 is included in all page headers and is hidden/shown when the onClick event is triggered via the feedback link in the menu.

I already wrote about feedback.cgi in a blog entry on use.perl.org.

And about the JS recently in "scraping my self".

4. less can be more


Just look at google.com and compare it to, for example, yahoo.com :-)

But I'm not going to compare those; rather I'll look at where less is more at ba.pm.org. This is the case for CSS and JavaScript. To optimize these there are two transformations using YUI Compressor. From Makefile:

YUICOMPRESSOR=java -jar script/yuicompressor-2.4.2.jar

JS_FILES_TO_MINIFY=\
	src/js/01_jquery-1.2.6.js \
	src/js/02_jquery.sprintf-0.0.3.js \
	src/js/03_Gettext-0.04.js \
	src/js/04_i18n-0.04.js \
	src/js/05_thickbox-3.1.js \
	src/js/06_mongers.js \
	src/js/07_maps.js \
	src/js/08_dropShadow.js \
	src/js/08_jquery.cookies.2.1.0.js \
	src/js/09_feedback.js

CSS_FILES_TO_MINIFY=\
	src/stylesheets/01_main.css \
	src/stylesheets/02_thickbox.css

html/js/script.js: ${JS_FILES_TO_MINIFY}
	cat ${JS_FILES_TO_MINIFY} | ${YUICOMPRESSOR} --type js --charset UTF-8 -o $@

html/stylesheets/style.css: ${CSS_FILES_TO_MINIFY}
	cat ${CSS_FILES_TO_MINIFY} | ${YUICOMPRESSOR} --type css --charset UTF-8 -o $@

There are two more things to compress, but no one has had any tuits to do it for the ba.pm.org page. Yet. One is the static HTML. It can be optimized by htmlclean. The other is the image files. There are plenty of tools to do so - pngout, optipng, pngcrush, jpegtran, pngquant, ...

Well and why? To save bandwidth, storage, download times => save money and at the same time enhance the user experience.

3. scraping my self


While thinking about how do I pre-generate the "Email send, thank you." and "Email send error, please try again later." feedback AJAX responses I realized I don't. :-) I use the statically generated response pages and just web-scrape them.

The interesting part from feedback.js:

// setup feedback submit event
$('#feedback_form').submit(function() {
    $.post(
        $('#feedback_form').attr('action'),
        {
            name: $('#formName').val(),
            email: $('#formEmail').val(),
            subject: $('#formSubject').val(),
            message: $('#formMessage').val()
        },
        function (data) {
            $('#content_inside_main .error')
                .replaceWith($(data).find('#content_inside_main .error'));
            $('#content_inside_main .info')
                .replaceWith($(data).find('#content_inside_main .info'));
        }
    );

    $('#feedbackLink').click();
    
    return false;
});

First the submit of the form is hijacked and an AJAX POST is issued instead. The CGI that handles the POST redirects to either email-ok.html or email-fail.html. These pages are generated for browsers without JavaScript (or with JavaScript disabled). After the AJAX call succeeds, function (data) {} is called with the data variable holding the HTML of the response page. Then the #content_inside_main .error and #content_inside_main .info elements are simply replaced with the same elements from data.

Fun huh? In this case yes, but in another case the same technique saves >300MB of content that would otherwise have to be pre-generated. But let's show that case some other time.

2. a hook for you


Em, I mean a commit hook. What can such a thing do for you? Let's see by example. There are two uses for ba.pm.org.

1st, to send out a colourful notification email to all interested parties when someone makes a commit.

2nd, when the commit is made to the place where the final page is stored, to update the web folder = publish the changes live.

post-commit:

#!/bin/sh

REPOS="$1"
REV="$2"

# send diff email after commit
/usr/bin/svnnotify -r "$REV" -C -d -H HTML::ColorDiff \
    -p "$REPOS" -f 'svn@cle.sk' \
    --subject-prefix "[$REV] $(svnlook author -r "$REV" "$REPOS")" \
    --to-regex-map 'someone@somewhere.net=^(www/pm|www/tt-pm)' \
    --to-regex-map 'someoneelse@somewhereelse.net=^(www/pm|www/tt-pm)'

# update bratislava.pm.org site after commit
if svnlook changed -r "$REV" "$REPOS" | grep -q '^....www/pm'; then
    sudo -H -u localuser /path/to/update-ba.pm.org
fi

Everyone knows make and Makefiles? At least in the Perl world everyone uses them when installing ExtUtils::MakeMaker CPAN distributions. Make is present on most systems, so why not find some more uses for it? (read: save time, ease the work)

Make was originally created to help with compiling and linking C source code. So? So the point is that we don't have to just compile source code, we can use Makefiles to process any kind of dependency based chain of files. Inside the Makefile there is always a target file that should be generated, together with the dependency files needed for the generation and a set of commands to perform the task. In addition, targets can be made PHONY, which means that the target commands will always be executed. This is most often used for the "clean" target - `make clean`, which should remove all temporary build/generated files and tidy up the folder.

The PHONY functionality can be used beyond housekeeping to define a set of commands (or a library of commands) that make sense for the current folder. For a development project this can be `make upload`, `make deploy` or `make ajoke` or whatever comes in handy.

The Makefile of ba.pm.org has a couple of targets. Transforming .po files to .js (using po2json), .xml to .tt2, .rdf or .js (using XSLT), minifying JS and CSS (using yuicompressor), etc. There are also a couple of PHONY targets like 'all' (to build the page), 'test' (to test the XML and site linking), 'clean' + 'distclean' to tidy up.

And that is, to break the blogging best practices and publish the whole blog series of 9 in one week. Instead of keeping them for periodic publishing.

And why? Because the promise to publish those is preventing me from writing about different things that I'm currently dealing with. So let's get these "jobs" done quickly and move on.

5. dôveruj ale preveruj


"5. trust but verify" - some time ago I promised to do a blog series and a lot of water went down the river since then, so it's really time to fulfill the promise.

Why start with part 5? As there was just one comment on the schedule announcement, from andy.sh saying that he likes the $title, I assume it is the most interesting one. :-)

All (most?) CPAN authors are writing tests. Tests are cool, let us sleep well, let us discover "oh I forgot" things.

While code is written once and remains that way for a while, most pages these days are generated on the fly to include personalized information, banners or whatever fancy shiny stuff. This may make the testing a bit more difficult. Having the pages statically generated makes it easy to test and, most importantly, validate the pages while still on the dev system.

Let's have a look at the vxml script. It will validate any XML file or a folder with .html, .xml, .rdf files using XML::LibXML for validation. In addition, all internal links (a href, img src) in HTML files are extracted and tested to see whether their target files exist. The output of the testing is TAP. What could be better, if TAP can test anything? ;-)
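
The link checking half of that idea can be sketched roughly as follows. This is not the vxml script itself: the function names are invented for the illustration, the links are pulled out with a simple regex instead of a real parser, and the XML::LibXML validation step is left out entirely. Test::More takes care of producing the TAP.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Spec;
use Test::More;

# Pull href/src targets out of HTML text; fragments and query strings are cut off.
sub extract_links {
    my ($html) = @_;
    return $html =~ /<(?:a|img)\b[^>]*?\b(?:href|src)\s*=\s*["']([^"'#?]+)/gi;
}

# Internal links are the ones without a scheme (http:, mailto:, ...) or a // prefix.
sub is_internal {
    my ($link) = @_;
    return $link !~ m{^(?:[a-z][a-z0-9+.-]*:|//)}i;
}

# Map a link to a path on disk, relative to the site root or to the page.
sub target_path {
    my ($root, $page, $link) = @_;
    return $link =~ m{^/}
        ? File::Spec->catfile($root, $link)
        : File::Spec->catfile(dirname($page), $link);
}

# Emit one TAP test per internal link found in the page.
sub check_page {
    my ($root, $page) = @_;
    open my $fh, '<', $page or die "cannot read $page: $!";
    my $html = do { local $/; <$fh> };
    close $fh;

    for my $link (grep { is_internal($_) } extract_links($html)) {
        my $target = target_path($root, $page, $link);
        ok(-e $target, "$page: target of $link exists");
    }
}

# usage (hypothetical file names): perl check-links.pl html html/index.html
if (@ARGV) {
    my ($root, @pages) = @ARGV;
    check_page($root, $_) for @pages;
    done_testing();
}
```

Since the output is plain TAP, such a script can be run through prove like any other test file.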

While ba.pm.org can be generated statically, some pages cannot, especially the ones requiring users to log in. A lot of them are under high load and testing every request would kill the (already loaded) machines. One technique is to randomly take only every Xth request and test the response that was sent to the client. When this is put into a PerlCleanupHandler, it has no effect on the user experience.
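
The sampling decision itself is tiny. Below is a sketch under stated assumptions: `make_sampler` and `after_response` are hypothetical names, the validation callback stands for whatever check you want to run, and the mod_perl wiring of the cleanup handler is deliberately left out.

```perl
use strict;
use warnings;

# Returns a closure that says "yes" to roughly one call out of $every.
# The random source can be injected, which makes the sampler testable.
sub make_sampler {
    my ($every, $rand) = @_;
    $rand ||= sub { rand() };
    return sub { return int($rand->() * $every) == 0 };
}

# test roughly every 100th response
my $should_test = make_sampler(100);

# Hypothetical callback to be run after the response has been sent;
# $response is the generated page, $validate is the check to run on it.
sub after_response {
    my ($response, $validate) = @_;
    return unless $should_test->();
    return $validate->($response);
}
```

Using a random sample instead of a strict counter means no shared state is needed between the Apache children.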


About this Archive

This page is an archive of entries from November 2009 listed from newest to oldest.
