Sometimes simply checking the size of the whole directory is not enough and you need more information: the whole directory is 3GB, but how much of it is taken up by the .avi files inside? The most flexible way of matching the files (or directories) is the find command, so let's build the solution around it. For the actual counting of the file sizes we can use du.

Consider the following directory. The numbers in brackets are the file sizes.

% tree -s filesizes    
filesizes
|-- [        116]  config.xml
|-- [        290]  config.yml
|-- [       4096]  dir1
|   |-- [       4096]  backup
|   |   `-- [      20583]  backup.tar.gz
|   |-- [          5]  blah space.yml
|   |-- [       2858]  script.php
|   `-- [       2858]  script.py
|-- [       4096]  dir2
|   `-- [       4096]  backup
|       `-- [      20583]  backup.tar.gz
`-- [       4096]  dir3
    `-- [       4096]  backup
        |-- [       4096]  backup
        |   `-- [      20583]  backup.tar.gz
        `-- [      20583]  backup.tar.gz
 
7 directories, 9 files

The size of the whole directory (the -b option gives the size in bytes):

% du -sb filesizes
121227	filesizes

Total size of selected files

Let's calculate the size of all the yml files. They are stored in different sub-directories and one of them contains a whitespace in its name (blah space.yml). The find part of the command is straightforward:

% find filesizes -type f -name '*yml'
filesizes/dir1/blah space.yml
filesizes/config.yml

Find all files (-type f) with a name ending in yml (-name '*yml') in the filesizes directory (that's the first argument; if it's omitted, find works from the current directory).
To join it with the du command, we change find's output to separate the files it finds with the NULL character instead of the default newline. We also pass the list of files to du and instruct it that the items (files) are coming as a NULL-separated list from standard input:

% find filesizes -type f -name '*yml' -print0 | du -cb --files0-from=-
5	filesizes/dir1/blah space.yml
290	filesizes/config.yml
295	total

The -c switch for du generates the last line, with the total.
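
The same pattern answers the .avi question from the introduction. A sketch (assuming you run it from the directory in question; tail keeps only the total line):

% find . -type f -name '*.avi' -print0 | du -cb --files0-from=- | tail -n 1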

Total size of selected directories

Let's try something a little trickier: calculating the total size of all sub-directories matched by a pattern. In our example, I would like to know the total size of all backup directories. Notice that dir3 has a backup sub-directory, which in turn has its own backup sub-directory. We don't want to count those as two separate entries and sum them up by accident, as that would inflate our total size (you could even end up with the size of all backup directories being bigger than the size of the whole top-level directory!).

% find filesizes -type d -name backup -prune -print0 | du -csb --files0-from=-
24679	filesizes/dir1/backup
49358	filesizes/dir3/backup
24679	filesizes/dir2/backup
98716	total

This time we're find-ing only directories (-type d) and, once we find one, we stop and don't descend into its sub-directories (-prune). On the du side we've added the -s switch, which produces a summarized total for each item (directory) it gets.


One of the easiest ways of optimizing the execution of a PHP script is using a static variable. Imagine a function that does some complex calculations and possibly executes several database queries. If the return value of the function does not change too often and the function may be called several times during a single request, it may be beneficial to cache the return value in a static variable.
You can implement it within your complex function, or in a wrapper function if you're trying to cache a function you didn't write.

function mymodule_complex_calculation() {
  static $cache;
  if (isset($cache)) {
    return $cache;
  }
  // Otherwise we need to perform our calculations
  // ...some complex calculation that returns a value
  $cache = complex_calculation();
  return $cache;
}

This is a very simple pattern and the cached data will “survive” only a single web request, but it may still be very effective, especially if the function is called many times during the same request. Drupal implements a drupal_static() function that serves as a centralized place for storing static variables. The advantage over using your own static variable is that data stored by drupal_static() can be reset. This means that if another piece of code decides that some cached variable should be cleared mid-request, it can do so. The function definition is:

function &drupal_static($name, $default_value = NULL, $reset = FALSE)

You need a unique name to address the data you want to cache; this becomes a key in the static array. The customary way in Drupal is to use the name of your function as $name, and you can use PHP's magic constant __FUNCTION__, which resolves to the current function name. Now we can update our function as follows:

function mymodule_complex_calculation() {
  $cache = &drupal_static(__FUNCTION__);
  if (isset($cache)) {
    return $cache;
  }
  // Otherwise we need to perform our calculations
  // ...some complex calculation that returns a simple value
  $cache = complex_calculation();
  return $cache;
}

You may notice that we don't use drupal_static() to actually set the data, and we don't need to: $cache is a reference into the centralized static store, so as soon as we assign anything to it, the value ends up there.
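
If another piece of code changes the underlying data mid-request, it can clear just this one entry with drupal_static_reset() (a minimal sketch; calling drupal_static_reset() without an argument resets all static variables):

// Discard the cached value; the next call to
// mymodule_complex_calculation() will recalculate it.
drupal_static_reset('mymodule_complex_calculation');
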
drupal_static() has one disadvantage: it adds a function call (plus the execution of some logic inside drupal_static()) every time you use the cache, which adds some overhead to each call. Compare it with the first listing, which is much simpler. At this level we are talking about micro-optimization; it will make a noticeable difference only if your function is called hundreds (or thousands) of times per request. If that's the case, you can use the advanced usage pattern for drupal_static(), which merges the functionality of our two functions:

function my_funct() {
  static $drupal_static_fast;
  if (!isset($drupal_static_fast)) {
    // 17 is the default value used when the cache is empty.
    $drupal_static_fast['cache'] = &drupal_static(__FUNCTION__, 17);
  }
  $cache = &$drupal_static_fast['cache'];
  // ...do whatever you need with $cache and return it.
  return $cache;
}

Pay attention to the fact that $drupal_static_fast is an array and the actual cache lives inside that array ($drupal_static_fast['cache']). You have to use an array; otherwise the reference to the static variable would be lost, and each time my_funct() was called it would have to retrieve the value using drupal_static() again, negating our optimization (see the manual for the details).
I've added a benchmark for it to my micro-optimization site. Assuming I've got the test right, you can see roughly a 17x difference between accessing a static variable and calling the drupal_static() function.


Apache log4j is one of the most popular frameworks for logging events in Java code. Apache CXF, on the other hand, is one of the most popular frameworks for building and consuming web services.

To monitor and debug a Java application you may need to log inbound and outbound web service messages.
If you use the two frameworks mentioned above, org.apache.cxf.interceptor.LoggingInInterceptor and org.apache.cxf.interceptor.LoggingOutInterceptor can be configured to log web service messages without any changes to the code.

With the following configuration, web service messages will be logged to the console at the INFO level.

log4j.rootLogger=WARN, console
 
## Output cxf logging information to console
log4j.logger.org.apache.cxf.interceptor.LoggingInInterceptor=INFO, console
log4j.logger.org.apache.cxf.interceptor.LoggingOutInterceptor=INFO, console
 
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.Target=System.out
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%6p | %d | %F | %M | %L | %m%n
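
The log4j configuration only controls where the interceptor output goes; the interceptors themselves still have to be attached to the CXF bus or endpoints. If your project does not already do that, one way (a sketch assuming a Spring-based setup with a cxf.xml file) is the built-in logging feature, which installs LoggingInInterceptor and LoggingOutInterceptor on the bus:

<?xml version="1.0" encoding="UTF-8"?>
<!-- cxf.xml: enables the logging feature (and thus the logging interceptors) bus-wide -->
<beans xmlns="http://www.springframework.org/schema/beans"
	xmlns:cxf="http://cxf.apache.org/core"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
		http://cxf.apache.org/core http://cxf.apache.org/schemas/core.xsd">
	<cxf:bus>
		<cxf:features>
			<cxf:logging/>
		</cxf:features>
	</cxf:bus>
</beans>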

More information related to logging and debugging in Apache CXF can be found here.


I was recently doing some MySQL 5.5 performance testing and had to generate a lot of SQL queries to use in my tests. To make the tests repeatable I needed to hardcode the values of the IDs. That is, I couldn't simply use:

INSERT INTO TABLE_NAME SET column1 = 'value1';

because this query may generate a row with a different ID each time (depending on the value of auto_increment for table_name). What I really needed was:

INSERT INTO TABLE_NAME SET column1 = 'value1', id=17;

To accomplish that, I needed to know the value of the next auto-increment ID for my table. Here is how you can retrieve that value in MySQL:

SELECT AUTO_INCREMENT FROM information_schema.TABLES WHERE TABLE_SCHEMA=DATABASE() AND TABLE_NAME='table_name';
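
If you are generating the queries from a shell script, the same lookup can be done with the mysql command-line client (a sketch; the schema name test and the connection details are placeholders):

% mysql -N -s -u root -h 127.0.0.1 -e "SELECT AUTO_INCREMENT FROM information_schema.TABLES WHERE TABLE_SCHEMA='test' AND TABLE_NAME='table_name'"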

This kind of “predicting” of the next ID will not work reliably and you should never use it in a real application. It is safe in my case, as I'm only using it to generate data (queries) and I'm 100% sure that there are no concurrent connections to my database.


Setup

Below I will demonstrate how to use the excellent Dependency Injection component from Symfony 2 without using the whole Symfony framework. I will also use the ClassLoader from Symfony, and just for the sake of the demo I will integrate it with Zend Framework. The code requires PHP 5.3 (mostly for namespace support).
Create a directory for your project. Inside, create a “lib” directory. Inside lib, create the directory structure Symfony/Component. Put the ClassLoader code inside Component; the easiest thing to do is to clone it from the Symfony GitHub:

cd lib/Symfony/Component
git clone https://github.com/symfony/ClassLoader.git

The same goes for DependencyInjection component:

cd lib/Symfony/Component
git clone https://github.com/symfony/DependencyInjection.git

Finally, download Zend Framework and put the contents of its Zend directory into lib/Zend (so, for instance, the Log.php file will be available at lib/Zend/Log.php).
The actual source code will go into the “src” directory, which is a sibling of “lib”.

Configuring the ClassLoader

DependencyInjection uses namespaces for managing classes, so it needs to be registered with the registerNamespace method. Zend Framework follows the PEAR naming convention for classes, so registerPrefix will do the work for us. Finally, I will register our own code, stored in the src directory, for which I will use namespaces as well. Create a new file (let's call it main.php) in the top-level directory:

require_once('lib/Symfony/Component/ClassLoader/UniversalClassLoader.php');
 
$loader = new Symfony\Component\ClassLoader\UniversalClassLoader();
$loader->registerNamespace('Zabuchy',__DIR__.'/src');
$loader->registerNamespace('Symfony',__DIR__.'/lib');
$loader->registerPrefix('Zend',__DIR__.'/lib');
$loader->register();
 
set_include_path(get_include_path().PATH_SEPARATOR.__DIR__.'/lib');

The ClassLoader should now work just fine, but we still need set_include_path so that the require calls inside the Zend code work correctly.

Dependency Injection container

Create a sample class that we'll use for testing. I will call it Test and put it in the Zabuchy\Techblog namespace, which means it should be located at src/Zabuchy/Techblog/Test.php:

namespace Zabuchy\Techblog;
 
class Test
{
    private $logger;
 
    public function __construct($logger) {
      $this->logger = $logger;
    }
 
    public function run() {
     $this->logger->info("Running...");
    }
}

$logger should be injected using the container; this is where we will use the Zend_Log class. It in turn requires a Writer, so we will create one as well. The rest of main.php looks like this:

use Symfony\Component\DependencyInjection\ContainerBuilder;
use Symfony\Component\DependencyInjection\Reference;
 
$sc = new ContainerBuilder();
 
$sc->register('log.writer','Zend_Log_Writer_Stream')
    ->addArgument('php://output');
$sc->register('logger', 'Zend_Log')
    ->addArgument(new Reference('log.writer'));
$sc->register('test','Zabuchy\Techblog\Test')
    ->addArgument(new Reference('logger'));
 
$sc->get('test')->run();

Running the code should give output like the following:

% php main.php 
2011-06-09T15:17:22+01:00 INFO (6): Running...

Complete code

For the reader's convenience I'm attaching the full source code, including a copy of the Zend library and the Symfony 2 components.


This is part 5 of the Java Technologies Integration tutorial. The purpose of this part is to update the Hibernate configuration in the existing project to make the maintenance of the project more straightforward.

Hibernate configuration files

Hibernate configuration was described in Java Technologies Integration tutorial – part 2 – Hibernate configuration. It is possible to remove the Hibernate mapping files and use automatic Hibernate mapping instead.

To do so, the following files should be changed as described below:

Hibernate.xml

sessionFactory bean

Change the sessionFactory bean class from org.springframework.orm.hibernate3.LocalSessionFactoryBean to org.springframework.orm.hibernate3.annotation.AnnotationSessionFactoryBean. This change enables automatic use of Hibernate annotations.

packagesToScan property

Remove the mappingResources property, including all of the mapping file references, and add the packagesToScan property instead. As its value, enter the package where your Hibernate-annotated classes are located, e.g. net.opensesam or, to be more specific, net.opensesam.entity. Hibernate will scan the package and find all the entities (classes annotated with @Entity) automatically, so you do not have to waste your time configuring entity classes.
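
For reference, a Hibernate-annotated entity class looks roughly like the sketch below (illustrative only: the class name net.opensesam.entity.User appears later in this configuration, but its fields are assumptions):

package net.opensesam.entity;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
// Illustrative entity; the fields are assumptions made for this example.
@Entity
public class User {
	@Id
	@GeneratedValue
	private Long id;
	private String login;
	public Long getId() {
		return id;
	}
	public String getLogin() {
		return login;
	}
	public void setLogin(String login) {
		this.login = login;
	}
}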

It is possible to define one package to be scanned:

<property name="packagesToScan" value="net.opensesam" />

as well as multiple locations of Hibernate-annotated classes:

<property name="packagesToScan">
	<list>
		<value>net.opensesam.entity1</value>
		<value>net.opensesam.entity2</value>
	</list>
</property>

annotatedClasses property

Instead of scanning you can also use the annotatedClasses property, but then you have to explicitly list all the classes annotated with @Entity. For example:

<property name="annotatedClasses">
	<list>
		<value>net.opensesam.entity.User</value>
		<value>net.opensesam.entity.Resource</value>
	</list>
</property>

Hibernate.xml file

The modified Hibernate.xml file of the project is presented below.

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://www.springframework.org/schema/beans  http://www.springframework.org/schema/beans/spring-beans-2.5.xsd">
	<!-- Hibernate session factory -->
	<bean
		class="org.springframework.orm.hibernate3.annotation.AnnotationSessionFactoryBean"
		id="sessionFactory">
		<property name="dataSource">
			<ref bean="dataSource" />
		</property>
		<property name="hibernateProperties">
			<props>
				<prop key="hibernate.dialect">${hibernate.dialect}</prop>
				<!-- In prod set to "validate", in test set to "create-drop" -->
				<prop key="hibernate.hbm2ddl.auto">update</prop>
				<!-- In prod set to "false" -->
				<prop key="hibernate.show_sql">false</prop>
				<!-- In prod set to "false" -->
				<prop key="hibernate.format_sql">true</prop>
				<!-- In prod set to "false" -->
				<prop key="hibernate.use_sql_comments">true</prop>
				<!-- In prod set to "false", in test set to "true" -->
				<prop key="hibernate.cache.use_second_level_cache">false</prop>
			</props>
		</property>
		<property name="packagesToScan" value="net.opensesam" />
	</bean>
</beans>

Mapping files

With these changes in place you no longer need the Hibernate mapping files (*.hbm.xml). Remove them from the src/main/resources/resources/hibernate/ directory.


Below is the fastest way to install the latest trunk version (currently 1.9.6) of Battle for Wesnoth on Ubuntu Natty Narwhal. Note: it will probably install a few more packages than you really need.

sudo apt-get install automake debhelper libboost1.42-all-dev libboost1.42-dev libboost-dev libboost-iostreams-dev libboost-regex-dev libboost-serialization-dev libboost-test-dev libdbus-1-dev libfreetype6-dev libfribidi-dev libghc6-pango-dev liblua5.1-0-dev libpango1.0-dev libsdl1.2-dev libsdl-dev libsdl-image1.2-dev libsdl-mixer1.2-dev libsdl-net1.2-dev libsdl-ttf2.0-dev python-support quilt scons subversion && svn co http://svn.gna.org/svn/wesnoth/trunk wesnoth && cd wesnoth && scons

That’s all – keep in mind that this may take a long time to execute… To play the game, you don’t need to install it, simply type:

./wesnoth

I’d like to share a quick guide on how to set up master/slave replication for the MySQL 5.5 server. The procedure below should be used for development/testing only. If you want to create a production-ready setup, you should follow instructions from MySQL official documentation or use MySQL server packaged by your favorite Linux distribution.

1. Download latest MySQL 5.5 from http://dev.mysql.com/downloads/mysql.

2. Follow the installation instructions at http://dev.mysql.com/doc/refman/5.5/en/binary-installation.html, or the instructions below if you don't care about a secure setup (e.g. you are only using this MySQL installation for testing). You should also follow my instructions if you want to avoid conflicts with a MySQL server you may have installed from your Linux distribution's packages. Do not follow them for a production setup.

  • switch to root
  • on ubuntu/debian run sudo apt-get install libaio1
  • cd /opt
  • download the MySQL tarball
  • unpack it: tar -zxf mysql-5.5.3-m3-linux2.6-x86_64-icc.tar.gz
  • ln -s mysql-5.5.3-m3-linux2.6-x86_64-icc mysql
  • cd mysql
  • scripts/mysql_install_db --basedir=/opt/mysql --datadir=/opt/mysql/data
  • bin/mysqld_safe --defaults-file=support-files/my-medium.cnf --user=root --basedir=/opt/mysql --datadir=/opt/mysql/data

You should be able to connect as root without a password:

% mysql -u root -h 127.0.0.1
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.3-m3 MySQL Community Server (GPL)

To shut down your MySQL instance, use:

mysqladmin -h 127.0.0.1 -u root shutdown

Now repeat the procedure on the second machine and we can move on.

3. Set up the replication.
The master is already configured for replication, as all the necessary settings (log-bin, server-id) are included in support-files/my-medium.cnf. You still need to create a replication user on the master:

CREATE USER 'repl'@'%' IDENTIFIED BY 'slavepass';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';

On the slave, edit /opt/mysql/support-files/my-medium.cnf and set:
server-id=2 (replacing the default id of 1)

Make sure you can connect from the slave machine to the master MySQL server using the 'repl' user. Then, on the slave, point it at the master and start replication:

CHANGE MASTER TO MASTER_HOST='192.168.1.10', MASTER_USER='repl', MASTER_PASSWORD='slavepass';
START SLAVE;
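
To confirm that the slave is replicating, check its status on the slave; both Slave_IO_Running and Slave_SQL_Running should report Yes:

SHOW SLAVE STATUS\G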

That's it! Let's test it. On your master, issue:

CREATE DATABASE symfony2;
USE symfony2;
CREATE TABLE test_table (a INT);
INSERT INTO test_table VALUES (1);

The corresponding table should be created on the slave and should contain one record. Now let's test temporarily disabling binary logging on the master:

SET sql_log_bin=0;
INSERT INTO test_table VALUES (2),(3);
SET sql_log_bin=1;
INSERT INTO test_table VALUES (4),(5);

Your table on the master should contain 5 rows, but the one on the slave only 3; the rows with values 2 and 3 should be missing.
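
A quick way to compare the two servers is to run the same query on the master and on the slave:

SELECT a FROM symfony2.test_table ORDER BY a;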


I wanted to change the way teasers are displayed and disable the node submission information: “published by admin on Fri, 04/22/2011 – 17:25”.

As it turns out, there is a variable that Drupal checks when generating the submission information. The variable is set per content type, in the format node_submitted_<NODE TYPE> (e.g. node_submitted_page), and should simply be set to TRUE or FALSE. Here is the code I've added to my micropt module, in micropt.install:

function micropt_install() {
  variable_set("node_submitted_micropt", FALSE);
}

You can change the value of the variable using drush or directly in the database:

UPDATE `variable` SET `value` = 'b:1;' WHERE `name` LIKE 'node_submitted_micropt'

“b:1;” is the serialized value for TRUE; for FALSE use “b:0;”. If you change the value directly in the database, don't forget to clear the cache!
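
With drush (assuming Drush on Drupal 7), the equivalent would look something like this; 0 disables the submission information and the cache clear makes the change visible right away:

drush vset node_submitted_micropt 0
drush cc all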


Say you would like to transfer a set of files from location A to B. The source directory is really big, but you are only interested in transferring some of the files (say *.png only). rsync is a good tool for the job. There are two methods I know of that can be used.

find + rsync

Use find to select the files and then send them over with rsync:

find source -name "*.png" -print0 | rsync -av --files-from=- --from0 ./ ./destination/

This will create only the minimum set of directories (and sub-directories) required for your files.

rsync only

Use the rsync command only, with its --include and --exclude options.

rsync -av --include='*/' --include='*.png' --exclude='*' source/ destination/

The difference from the previous command is that rsync will create all the directories from source in destination and only then copy the *.png files. It can potentially leave your destination filled with empty directories, which may or may not be what you want.
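
If the empty directories are unwanted, rsync's -m (--prune-empty-dirs) option should prune them from the transfer; a variant of the same command:

rsync -avm --include='*/' --include='*.png' --exclude='*' source/ destination/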