Archive for the ‘Product Engineering’ Category

Continuous performance testing

January 23, 2010 Indraneel

Once I asked an engineering manager what they were doing to ensure that the product they were building would be able to endure peak load. After a blank stare, all I got was that they were adding features and would worry about performance later, when the development hubbub settled down.
After the product was built, one of the team members ran a performance test and found that a lot of the user actions took an inordinate amount of time to complete, even under a concurrent user load of just 5.
Leaving performance considerations till the end of the development cycle is just dangerous. I have seen teams forced to make architectural changes right before a release and then scurry to meet the deadline, because they never found out the performance impact of an architectural decision they took 4 sprints ago.
That said, in one of the product teams that I was working with, we found out that some of the user actions were taking an inordinate amount of time only because users complained about it. This is perhaps the last way you want to find out. But anyway, we agreed that we needed to know the performance impact of each new feature we pushed out and each enhancement we made. We were doing weekly releases, so we knew that it would require quite a bit of automation. Choosing performance scenarios, creating scripts, running them and analyzing the results was too much to do manually every week.
Enter Bamboo, JMeter and Gruff. Bamboo would run the performance scripts created in JMeter along with the build every Tuesday, before the build got pushed to the staging server. We stored the results for the scenarios in MySQL, created graphs with Gruff and showed them as Bamboo “artifacts” for the Tuesday builds. Yes, I know what you are thinking. Maybe we should have used rrdtool instead of MySQL and Gruff. We tried rrdtool and gave up, mainly because we wanted to keep all our performance data, not just a window of it. Since we were not running very heavy loads, we decided to generate the load from the Bamboo machine itself, as it was a pretty decent machine. We kept the target machine isolated to make sure nothing else ran on it except the performance tests.
A Ruby script would trigger the JMeter scripts just after the build was complete, bundled and deployed on the target server by Capistrano. For result aggregation we used the XSLT provided by JMeter and tweaked it to produce XML instead of HTML. The aggregated data went into the MySQL database. We then used Gruff to create the graphs and stored them as Bamboo artifacts.
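To give a rough idea of the aggregation step, here is a minimal sketch in Python. It is not what we ran: we used a tweaked XSLT, and this swaps in a small script instead. It assumes JMeter's XML result format, where each httpSample element carries the elapsed time in t, the scenario label in lb and the success flag in s. The sample data and the aggregate function are made up for illustration.

```python
import statistics
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical results in JMeter's XML format; a real run would read the result file.
SAMPLE_JTL = """<testResults version="1.2">
  <httpSample t="120" lb="Login" s="true"/>
  <httpSample t="180" lb="Login" s="true"/>
  <httpSample t="900" lb="Search" s="false"/>
  <httpSample t="300" lb="Search" s="true"/>
</testResults>
"""


def aggregate(jtl_xml):
    """Summarise elapsed times (in ms) and error counts per scenario label."""
    elapsed = defaultdict(list)
    errors = defaultdict(int)
    for sample in ET.fromstring(jtl_xml).iter("httpSample"):
        label = sample.get("lb")
        elapsed[label].append(int(sample.get("t")))
        if sample.get("s") != "true":
            errors[label] += 1
    return {
        label: {
            "samples": len(times),
            "avg_ms": statistics.mean(times),
            "max_ms": max(times),
            "errors": errors[label],
        }
        for label, times in elapsed.items()
    }


for label, stats in aggregate(SAMPLE_JTL).items():
    print(label, stats)
```

One row like this per scenario per build is all the graphing step needs, which is why keeping every run in a plain database table was attractive.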
Here is a little sample of what it looked like in the end.

Performance test with JMeter, Bamboo and Gruff

With the performance tests running with every Tuesday build, it was now easy to see the performance impact of new features and enhancements.

Pre-loading the distributed cache

January 22, 2010 Indraneel

We had been using memcache for distributed caching in one of the products that we were working on. We had 4 application servers, each with 2GB of memory dedicated to memcache. Often, when we delivered new functionality or a patch, we needed to restart the memcache servers. And then for a day, the site thrashed. We got horrible response times for a lot of pages which were supposed to be cached. We dug a little deeper and found that the cache loader which was supposed to pre-load the cache needed about 50 hours to complete. No wonder the cached pages were so slow. What we actually needed was a multi-threaded pre-loader running on two different machines. With some code optimizations and 16 threads pre-loading the pages into memcache, the pre-load took us about 3 hours.
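The shape of a multi-threaded pre-loader can be sketched in a few lines of Python. This is an illustration only, not our loader: render_page is a hypothetical stand-in for the expensive page-building work, and a plain dict stands in for the memcache client so the sketch runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor


def render_page(page_id):
    # Hypothetical stand-in for the expensive work of building a cacheable page.
    return f"<html>page {page_id}</html>"


def preload(cache, page_ids, workers=16):
    """Render pages on a pool of worker threads and store each one in the cache."""

    def load_one(page_id):
        cache[f"page:{page_id}"] = render_page(page_id)
        return page_id

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map hands the ids out to the workers; count the completed loads.
        return sum(1 for _ in pool.map(load_one, page_ids))


cache = {}  # a dict stands in for the memcache client in this sketch
loaded = preload(cache, range(100))
print(loaded, len(cache))
```

Splitting the work across two machines is then a matter of giving each machine a disjoint slice of the page ids, for example the even ids to one and the odd ids to the other.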

More about email deliverability

December 20, 2009 Indraneel 1 comment

Some time ago I wrote a post about how to send emails without getting flagged as SPAM. Unfortunately I had missed out on a few things.

rDNS

One of the most important things in email deliverability is rDNS (also known as the PTR record). rDNS stands for “reverse” DNS. Here is an example of how rDNS works:

  • A mail header says that the sender is abc@abc.com and that the mail was sent from 111.222.333.444
  • The receiving mail server verifies whether 111.222.333.444 really points to abc.com by doing an rDNS lookup
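The check described above can be sketched in Python as follows. Treat it as an assumption-laden illustration: the function names are mine, the lookup is injectable so the example can run without network access, and real mail servers usually also confirm that the returned hostname resolves back to the same IP. The address 192.0.2.25 comes from the range reserved for documentation.

```python
import ipaddress
import socket


def ptr_query_name(ip):
    """Return the DNS name that is queried for the PTR record of an IP address."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_dns_matches(ip, expected_domain, lookup=socket.gethostbyaddr):
    """True if the PTR hostname of `ip` is `expected_domain` or one of its subdomains."""
    try:
        hostname = lookup(ip)[0]
    except OSError:
        # No PTR record (or the lookup failed): this is the "unknown" sender case.
        return False
    hostname = hostname.rstrip(".").lower()
    expected = expected_domain.lower()
    return hostname == expected or hostname.endswith("." + expected)


def fake_lookup(ip):
    # Hypothetical resolver answer, shaped like socket.gethostbyaddr's return value.
    return ("mail.abc.com", [], [ip])


print(ptr_query_name("192.0.2.25"))
print(reverse_dns_matches("192.0.2.25", "abc.com", lookup=fake_lookup))
```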

Mail providers like AOL and Google are very particular about rDNS, i.e. they check for a reverse DNS entry for each mail they receive. Your important newsletter is almost sure to end up in the bulk email folder if you haven’t set up the reverse DNS entry for your mail server’s IP address.
Now if you are hosting your nice web-application on Amazon EC2, reverse DNS won’t work for you, because Amazon won’t set your reverse lookup. You will have to use a third-party email service like authsmtp or fastmail to send out mass emails with maximum deliverability.

DomainKeys Identified Mail

DomainKeys Identified Mail (DKIM) is a method for email authentication (as the DKIM website says). Basically, DKIM allows the sender of an email to sign the email using public key cryptography. Prominent email services like Yahoo, Gmail and Fastmail implement DKIM. This is how it works in a nutshell:

  • The sender of the email adds a header field named “DKIM-Signature” which contains a digital signature covering selected headers and the body of the email message
  • The receiving SMTP server does a DNS lookup and gets the public key for the signing domain
  • It uses the public key to verify the signature; the message itself is not encrypted
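To make the DNS lookup step a little more concrete, here is a small Python sketch that splits a DKIM-Signature value into its tags and derives the DNS name under which the public key is published (selector, then _domainkey, then the signing domain). The header value is invented, with obvious placeholders for the hashes, and the sketch stops short of the cryptographic verification itself.

```python
# Hypothetical DKIM-Signature value; bh= and b= would hold base64 data in a real mail.
SAMPLE_SIGNATURE = (
    "v=1; a=rsa-sha256; d=abc.com; s=mail2009; "
    "h=from:to:subject:date; bh=BODYHASH; b=SIGNATURE"
)


def parse_dkim_signature(value):
    """Split a DKIM-Signature header value into a {tag: value} dict."""
    tags = {}
    for item in value.split(";"):
        if "=" not in item:
            continue
        name, _, val = item.partition("=")
        # Folding whitespace is dropped; good enough for the tags used in this sketch.
        tags[name.strip()] = "".join(val.split())
    return tags


def public_key_dns_name(tags):
    """Name of the DNS TXT record that holds the signer's public key."""
    return f"{tags['s']}._domainkey.{tags['d']}"


tags = parse_dkim_signature(SAMPLE_SIGNATURE)
print(public_key_dns_name(tags))
```

The receiver fetches the TXT record at that name and uses the key in it to check the signature in the b= tag.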

You can use Javamail with DKIM for sending out that important newsletter from your web application.

Sender Policy Framework

SPF is a specially formatted DNS record which specifies which machines can send emails for a domain.

  • For example the owner of abc.com can determine which hosts are allowed to send emails whose sender email address ends in @abc.com
  • Receivers who check SPF can reject messages from unauthorized hosts before receiving the message body

And here’s what the SPF record may look like:

abc.com TXT "v=spf1 ip4:111.222.333.444 -all"
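Here is a rough Python sketch of how a receiver might evaluate such a record. It handles only the ip4 and all mechanisms; a real checker also has to deal with a, mx, include, redirect and the DNS lookups behind them. The function name and result strings are my own choices, and since the placeholder address above is not a parseable IPv4 address, the sketch uses addresses from the documentation ranges.

```python
import ipaddress

# Map SPF qualifiers to results; a mechanism without a qualifier means "+".
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def check_spf(record, sender_ip):
    """Evaluate the ip4 and all mechanisms of an SPF record, left to right."""
    terms = record.split()
    if not terms or terms[0] != "v=spf1":
        return "none"
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        if term.startswith("ip4:"):
            if ip in ipaddress.ip_network(term[4:], strict=False):
                return QUALIFIERS[qualifier]
        elif term == "all":
            return QUALIFIERS[qualifier]
        # Any other mechanism is skipped in this sketch.
    return "neutral"


record = "v=spf1 ip4:192.0.2.10 -all"
print(check_spf(record, "192.0.2.10"))
print(check_spf(record, "198.51.100.7"))
```

The first matching mechanism wins, which is why the catch-all "-all" goes last: anything not listed before it is rejected.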

White lists, Black lists

  • Blacklists are lists of domain names which are known to send SPAM emails. Basically they are the lists of known offenders.
  • Whitelists are the opposite – lists by which an ISP allows someone to bypass spam filters when sending emails to its subscribers


So …. Get Registered

MxToolBox and ReturnPath are both good options

JIRA-git integration

December 19, 2009 Indraneel

I’ve been messing around with JIRA for quite some time now. So when one of our product teams got stuck with integrating JIRA with git, I tried to help. There was a JIRA-git plugin over here and it worked fine. Problems arose when the plugin did not pull commits made on branches into JIRA. So I went ahead and fixed it. Now the plugin pulls in all the changes from the branches, but unfortunately it does not show the branch name along with the name of the changed file in the JIRA tab. I’ll try and fix that soon. In the meantime you can download and use the modified plugin from this location.

Continuous Integration and Automated Deployment for PHP

January 16, 2009 Indraneel

I had set up Continuous Integration for a few Ruby on Rails products including Workstreamr and PaidInterviews. It was easy with rake, the ci_reporter plugin, rcov and Bamboo. I also used Capistrano for automated deployments after each build. Development of a new product started a few days ago and I was called upon to set up the Continuous Integration platform again. It was different this time though. This new product was to be developed in PHP.
Rake is a general purpose build tool, just like Ant and Maven. And Capistrano is the best deployment tool I have encountered so far. So the choice was easy. This is how I created a rake task to execute all the unit tests and produce test coverage reports with PHPUnit.

Yeah, yeah, I know, I could have done the same thing with a shell script. But when I start creating tasks for database migration and such, things are going to get messy, and tracking dependencies wouldn’t be easy with a shell script.

PHPUnit can also produce coverage reports. So I included the second task which would produce coverage reports.

Now for the deployment script.

And now I bundle all this up neatly in a shell script and give it to Bamboo. Here is what my build-deploy.sh script looks like:

#!/bin/bash
rake build:report_coverage
cap remote deploy

Back to blogosphere

November 2, 2008 Indraneel

September was a crazy month. PaidInterviews got launched at DEMO and I worked really hard. By the end of September, things eased up and I got some breathing room. The financial crisis had changed the world by then. I took a couple of vacations for Durga Puja and Diwali. And oh boy, did I need them! I’m well rested and focused now. I’ll post a few things very soon.

How to avoid getting “Flagged as Spam” while sending legitimate emails

July 15, 2008 Indraneel 4 comments

The menace called SPAM is a double-edged sword. On one hand we have to constantly fight to keep SPAM from reaching our mailboxes. On the other hand we have to be careful that the legitimate emails we send don’t end up caught in the net of anti-spam software. Take the example of sending automated emails to people who sign up for the beta version of an exciting product that we are building. These emails are not unsolicited. The ultimate control over the delivery of such emails resides with the servers receiving them and the anti-spam policies and software they implement. However, following the guidelines below will drastically reduce the chances of getting flagged by anti-spam software.

I. Don’t spoof your identity.

Be accurate about who you are and where you are sending the email from. It’s always good to send your emails from your own servers, using your own domain name. For example, if you have an application running on a machine in the domain example.com and you are sending emails from a mail server in that domain, it’s preferable that the sender’s address be someone@example.com. If you try to hide your source and destination you’ll look like spam. Don’t add unnecessary headers to the emails.
The email with the following header went to the junk email folder.


Received: by *ip-xxx-xxx-171-173.ip.xxxxserver.net* (Postfix, from userid 99)
id 7C95F298101; Wed, 9 Jul 2008 05:16:24 -0700 (MST)
Received: from ip.secureserver.net (ip-xxx-xxx-171-173.ip.xxxxserver.net [127.0.0.1])
by ip-xxx-xxx-171-173.ip.secureserver.net (Postfix) with ESMTP id 6385E2980F1
...
Date: Wed, 9 Jul 2008 05:16:23 -0700
From: *admin@somedomain.com*

The domain names of the “Received: ” and “From:” fields don’t match.

II. Genuine domain names

Use a domain name which is identified by a verifiable IP address. For this reason, it is very important to have an rDNS (Reverse DNS) entry, also known as a PTR record, for the server from which you are sending emails. Most anti-spam software rejects emails sent from servers that don’t have an rDNS entry.

The email with the following header went to the junk email folder.

Received: from *unknown* (HELO ip-xxx-xxx-171-173.ip.xxxxserver.net) (208.109.171.173)
by k2smtpout06-01.prod.xxxx.xxxxserver.net (xx.xx.189.102) with ESMTP; 08 Jul 2008 06:34:42 -0000

“unknown” in the “Received” header means the receiving server could not determine the identity of the server sending the email, which is generally caused by the lack of an rDNS entry for the sending server.

III. Send well constructed emails

Emails with missing MIME sections, invalid or missing Message-IDs, invalid or missing Date headers, a missing Subject and so on frequently look like spam.
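Most mail libraries will fill in these headers for you if you let them. As a minimal sketch, this is how it might look with Python's standard email package; build_message and the addresses are made up for the example, and the Message-ID domain is simply taken from the sender's address.

```python
from email.message import EmailMessage
from email.utils import formatdate, make_msgid, parseaddr


def build_message(sender, recipient, subject, body):
    """Build a plain-text mail with the headers spam filters expect to see."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg["Date"] = formatdate(localtime=True)
    # Use the sender's own domain in the Message-ID instead of the local hostname.
    domain = parseaddr(sender)[1].rpartition("@")[2]
    msg["Message-ID"] = make_msgid(domain=domain)
    # set_content adds the Content-Type and MIME-Version headers.
    msg.set_content(body)
    return msg


message = build_message(
    "someone@example.com",
    "user@example.org",
    "Welcome to the beta",
    "Thanks for signing up.\n",
)
print(message)
```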

IV. Encodings

Avoid needless encodings and charsets in the emails. Don’t use base-64 encoding unless you really need to.

Consider the Subject field in the header below:
Subject: =?iso-8859-2?B?U1BBTTpSZWdhaW4geW91ciBuYXR1cmFsIHdlbA==?=
=?iso-8859-2?B?bG5lc3M=?=
Content-Type: text/plain;
charset="iso-8859-2"

The character set in this email is ISO-8859-2 (Latin-2), a single-byte encoding used for a lot of central and eastern European languages such as Hungarian and Polish; it is not a Unicode encoding. This message ended up in my junk mail folder.
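You can see how needless the encoding was by decoding the Subject above. A small Python sketch using the standard email.header module; decode_subject is my own helper name, and the two encoded words are joined with a single space here instead of the folded line break.

```python
from email.header import decode_header

# The Subject header from the example above, unfolded onto one line.
RAW_SUBJECT = (
    "=?iso-8859-2?B?U1BBTTpSZWdhaW4geW91ciBuYXR1cmFsIHdlbA==?= "
    "=?iso-8859-2?B?bG5lc3M=?="
)


def decode_subject(raw):
    """Decode RFC 2047 encoded words in a header value into plain text."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii")
        parts.append(chunk)
    return "".join(parts)


subject = decode_subject(RAW_SUBJECT)
print(subject)
print(subject.isascii())
```

The decoded text contains nothing but ASCII characters, so neither the base64 wrapping nor the ISO-8859-2 charset was doing anything useful.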

The following is a part of the header which ended up in my junk mail:

X-OriginalArrivalTime: 13 Jun 2008 13:52:08.0648 (UTC) FILETIME=[AC00D480:01C8CD5C]
--Apple-Mail-16-982198482
Content-Disposition: inline;
filename="Picture 6.png"
Content-Type: image/png;
x-mac-hide-extension=yes;
x-unix-mode=0644;
name="Picture 6.png"
Content-Transfer-Encoding: base64

V. HTML emails

Malformed HTML

If you’re using HTML emails then the least you can do is to make sure that the HTML is valid. Unbalanced and invalid tags are bound to flag an email as spam.
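A quick sanity check for unbalanced tags is easy to script before a mail template goes out. The sketch below uses Python's standard html.parser; the class and function names are mine, and the check is deliberately strict, so it also flags closing tags that HTML allows you to omit. It is no substitute for a real validator.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Track open tags on a stack and record closing tags that don't match."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.problems.append(f"unexpected </{tag}>")


def find_tag_problems(html):
    """Return a list of human-readable problems; an empty list means balanced."""
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    return checker.problems + [f"unclosed <{tag}>" for tag in checker.open_tags]


print(find_tag_problems("<p>Welcome <b>Everyone</b></p>"))
print(find_tag_problems("<p>Welcome <b>Everyone</p>"))
```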

Invisible text in HTML

If you’re using HTML emails, do not use invisible text within those emails. Make sure your text colors and sizes are distinct enough and large enough to read. Invisible text (e.g. text whose color is the same as the background color) is often identified as a sign of spam.

Consider the following logo in an HTML email:

Welcome Everyone

The following is the source for this:
<p style="color:#4d4d4d;font-family:Arial,Helvetica,Verdana,sans-serif;font-size:24px;font-weight:bold;margin:0;padding:5px 0 0 5px;">Welcome<span style="color:#808080;"> Everyone</span></p>

This is good and valid. However the following is not:

Since it’s some text disguised as an image.

VI. Keep it simple

Do not use cute spellings, don’t space out your words, and don’t put str@nge l3tters 0r characters into your emails. You are bound to look like spam if you do.

For example some people emphasize/stylize the text by writing:
L E G I T I M A T E .
Text like this makes the email likely to get caught by spam filters.
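If you want to catch this in your own templates before a filter does, a simple pattern check goes a long way. The Python sketch below is my own heuristic, not a rule taken from any real spam filter: it looks for runs of four or more single letters separated by single spaces. It can be fooled by a one-letter word such as "a" sitting right next to such a run.

```python
import re

# Four or more single letters separated by single spaces, e.g. "L E G I T".
SPACED_OUT = re.compile(r"\b(?:[A-Za-z] ){3,}[A-Za-z]\b")


def find_spaced_out_words(text):
    """Return the spaced-out words found in the text, with the spaces removed."""
    return [match.group(0).replace(" ", "") for match in SPACED_OUT.finditer(text)]


print(find_spaced_out_words("Buy L E G I T I M A T E pills today"))
print(find_spaced_out_words("This is a normal sentence."))
```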

If you found this post helpful, there is some more stuff on the same topic in a later post of mine over here

Winning strategies

June 30, 2008 Indraneel 1 comment

I’ve been serving in the Agile army for some time now. Being a foot-soldier I’ve been pretty close to ground-zero and hence to the ground realities.

Being Agile is hard. It may sound a little odd, but it is true.

Discipline

Being Agile requires quite a bit of discipline. In fact success with Agile depends on it. We were developing this shiny new product. For quite some time during one iteration our builds were failing continuously. Though we had a continuous integration platform, we lacked the discipline to make sure that builds passed every day. Eventually, when the build statistics showed us all reds for about 2 weeks continuously, we kinda woke up to it. It took us a few days to figure out what was actually wrong with the builds. Had we acted upon it the very first day the builds failed, it would have taken us a lot less time to figure out what was wrong. Moral of the story – “Having a kickass tool or platform does not mean we follow Agile methodology. The entire team has to make sure that the tools are used in the everyday life of product development. And that requires discipline.”

Planning

Being Agile does not mean that we try to build an aircraft today, a wicker basket tomorrow and a flower vase the day after. Agile does not mean ‘no planning’. In fact Agile is a lot about planning.

Which brings me to iteration or sprint planning. A functionality freeze at the beginning of an iteration or sprint is not a “nice to have”. It’s a necessity for delivering a “quality” product “on time”. In one of the products that we were developing, the product manager saw the need to modify the functionality in the middle of the iteration. He had valid business reasons to do that, but that iteration was a nightmare. The quality of the product suffered badly, the developers were stressed and stretched to their extremes and the engineering manager spent sleepless nights. Unless billions are at stake, I don’t recommend doing it ever.

Design

There is no substitute for good design.
Bad Design + Flawless code = Bad product. Nobody would buy a machine which has a super fast processor but can only be started after opening the casing and finding the right wires to connect.
Excellent Design + Bad Code = Still a bad product, but it’s relatively easy to fix. Design flaws are very expensive to fix. The chances of getting a better product increase geometrically when the design is given enough time to iterate and mature.

Teamwork

The whole team needs to have a holistic view of the product. People working in silos with blinders on don’t make a good team and are usually not at their productive best. This is an issue of epidemic proportions. I have seen it in mom-and-pop software shops as well as in Fortune 100 organizations. Good software requires more than collaboration. Each member of the team needs to understand and appreciate the big picture of the product, the purpose the product will serve and the kind of users who are likely to use the product. More often than not, engineers underestimate, misunderstand or simply ignore the value of having this holistic view. It’s not enough if only the functionality is implemented right.

Cross-functional skills in the team are the need of the day. A job done by keeping in mind that one’s output will become someone else’s input is a job done well. A UI designer who knows a little about coding will be able to design the UI in a way which can make coding a breeze. An eye for detail can save a lot of time by cutting down on rework. A developer who keeps an eye open for obvious design or UI errors can reduce rework manifold. For example, if the input boxes in a web page are not properly aligned, the developer should contact the UI developer or the designer and get it resolved before starting to code. It would take quite some rework if the non-aligned input boxes made it to a QA build.

In the end, the success of a product depends on a lot of factors. Some of these we don’t have control over. But we gotta do our best with the factors we can control.

This is the first post in this series. There is more to come.