Data Science at Flurry

Author: Soups Ranjan, PhD


Flurry, being an integral part of the mobile app ecosystem, has very interesting data sets and consequently highly challenging problems. Flurry has insight into the 370K apps that have the Flurry Analytics SDK installed, across the 1.1 billion unique mobile devices that we see per month. While there are parallels to the web world, quite a few data science problems at Flurry are brand new and not simply derivatives of an analogous web-world problem.



The most challenging problems that the data science team at Flurry deals with are estimating user characteristics (e.g., age,  gender, and interests of app users) and predicting responses to advertising (and therefore which ads should be served to which people).



In this post, I’ll describe our approach towards solving one of the data science problems in-depth: the ad conversion estimation problem. Before we serve an ad to a user inside an app, we’d like to know the probability that the user will click on that ad. Given that we have historical information on which ads were clicked on, we can use that as our training sample and train a supervised learning algorithm such as Logistic Regression, Support Vector Machines, Decision Trees, Random Forests, etc.


Problem Definition: We treat this as a binary classification problem where, given a set of ad impressions and outcomes (whether the impression resulted in a conversion or not), we train a binary classifier. An impression contains details across three dimensions: user, app, and ad. Users in the Flurry system correspond to a particular device as opposed to a particular person. User features include details about the device (such as OS version), device model type, and prior engagement with mobile apps and ads. App features include details about the app, such as its category, average Daily Active Users, etc. Ad features capture details such as the advertiser's bid price, ad format (e.g., whether it is a banner ad or an interstitial), and ad type (e.g., whether it is a CPC [Cost-Per-Click], CPV [Cost-Per-Video-view] or CPI [Cost-Per-Install] ad). We also use time-specific features such as hour of day (in local time), day of week, and day of the month, as well as location-specific features such as country or city, as all of these have an important bearing on the probability of conversion.

 

More specifically, we use Logistic Regression to solve this problem. Logistic Regression predicts the probability of an event by fitting the data to a logistic curve. Consider an ad impression with outcome Y and features X = {x1, x2, …, xn}. We define a logistic function f(z) = P(Y=1|X=x), the probability that the impression converts; the outcome Y is 1 when the impression converts and 0 when it doesn't. We define f(z) = 1/(1 + exp(-z)), where z is the logit, given by:

z = β0 + β1x1 + β2x2 + … + βnxn

where β0 is called the intercept and the βi are the regression coefficients associated with each feature xi. Each regression coefficient describes the size of that feature's contribution. A positive regression coefficient means the feature pushes the predicted probability towards an outcome of 1, indicating the feature is predictive of an impression converting. Similarly, a negative regression coefficient means the feature pushes the predicted probability towards an outcome of 0, indicating the feature is predictive of an impression not converting.
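
To make the scoring step concrete, here is a minimal sketch, in JavaScript, of evaluating the logistic function over a sparse feature map. It is purely illustrative: the feature names, weights, and intercept below are made up, and this is not our production scoring code.

// Minimal sketch of logistic scoring for one impression.
// "weights" maps feature name -> learned coefficient (the βi); "intercept" is β0.
// All names and values here are hypothetical.
function logisticScore(features, weights, intercept) {
  var z = intercept;
  for (var name in features) {
    if (weights.hasOwnProperty(name)) {
      z += weights[name] * features[name];
    }
  }
  return 1 / (1 + Math.exp(-z)); // f(z) = P(conversion | features)
}

var p = logisticScore(
  { localHourOfDay_17: 1, adType_CPI: 1, avgDAU: 0.35 },      // sparse features for one impression
  { localHourOfDay_17: 0.16, adType_CPI: -0.02, avgDAU: 0.9 }, // learned coefficients
  -4.2                                                         // intercept β0
);
// p is the estimated probability that this impression converts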


Data deluge, or how much data is enough? One of the most important design criteria is the number of ad impressions we use to train a model. Our experience is that after a while the marginal gains in the model's performance from larger data sets are not worth the higher costs associated with using larger data samples. Hence, we use MapReduce jobs to pull data from our HBase tables and prepare a training set that consists of tens of millions of ad impressions, which is then fed into a single high-performance machine where we train our models. Our feature space consists of tens of thousands of features, and as you can imagine, our data is highly sparse, with only some features taking values for any given impression.


Interpretable vs. black-box models: Recently, black-box techniques such as Random Forests have become highly popular among data scientists; however, our experience has been that simple models, such as Logistic Regression, achieve similar performance. Our usual approach is to first test a variety of models. If the performance differences are not significant, then we prefer simpler models, such as Logistic Regression, for their ease of interpretability.


Offline batch vs. Online learning: At one end of the spectrum, offline learning algorithms can learn a sophisticated model via potentially multiple passes over the data (also referred to as batch algorithms), where the learning time is not a constraint. At the other end of the spectrum are online learning algorithms (such as the very popular tool Vowpal Wabbit), where we attempt to train a model in near real-time by making one pass over the data, at the cost of some accuracy. With this approach, we use the model as trained so far to score the current ad impression and then use this same impression to update the model's parameters. The memory requirements are much lower, as we iteratively learn new weights while considering only the current impression, so we can play with a much larger feature space. We can then explore non-linear feature spaces by considering polynomials of our regular features, e.g., (avgDAU^2, …, avgDAU^n), as well as cross-correlation (or interaction) terms between features, e.g., avgDAU * hourOfDay, which intuitively captures the number of active users in an app at a particular hour of the day.
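
As a rough sketch of what that feature expansion looks like (the helper below is hypothetical and not our actual pipeline; the feature names follow the examples above):

// Sketch: expanding a feature map with polynomial and pairwise interaction terms.
function expandFeatures(features, degree) {
  var expanded = {};
  var names = Object.keys(features);
  names.forEach(function (name) {
    expanded[name] = features[name];
    // polynomial terms, e.g. avgDAU^2 ... avgDAU^degree
    for (var d = 2; d <= degree; d++) {
      expanded[name + "^" + d] = Math.pow(features[name], d);
    }
  });
  // interaction terms, e.g. avgDAU*hourOfDay
  for (var i = 0; i < names.length; i++) {
    for (var j = i + 1; j < names.length; j++) {
      expanded[names[i] + "*" + names[j]] = features[names[i]] * features[names[j]];
    }
  }
  return expanded;
}

console.log(expandFeatures({ avgDAU: 0.35, hourOfDay: 17 }, 2));
// { avgDAU: 0.35, "avgDAU^2": 0.1225, hourOfDay: 17, "hourOfDay^2": 289, "avgDAU*hourOfDay": 5.95 }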


Time to score a model: A highly important consideration is the ability to score an impression against the model within tens of milliseconds to estimate its conversion probability. The primary reason is that we don't want to make a user who's waiting to see an ad (and consequently waiting to get back to their app after the ad) wait for a long time, leading to a poor user experience. Given this consideration, several models that would otherwise be well-suited to this problem simply don't qualify. For instance, Random Forests don't work in our case because in order to score an impression we would have to evaluate it against tens, or even hundreds, of Decision Trees, each of which may take a while to evaluate the impression feature-by-feature down a tree. Granted, one could parallelize the scoring such that one impression gets scored via one Decision Tree on one machine core; however, in our experimental evaluations, we didn't see any noticeable gains from Random Forests to justify going down this route.


Unbalanced data: Another interesting quirk of our problem space is that conversions are highly unlikely events. For instance, for CPI ad campaigns, we might see a few app installs per thousand impressions. Hence, the training set is highly unbalanced, with many, many more non-converting impressions than converted ones. A simple, yet incorrect, model could perform quite well by always predicting every impression as non-converting, but that would be self-defeating since we are actually interested in predicting which impressions lead to conversions. To avoid this, we assign a higher weight to converting impressions and a lower weight to non-converting ones so that we still get adequate performance at predicting conversions. We learn the weights via cross-validation, trying different weight ratios in different cross-validation experiments and selecting the ratio that maximizes our performance.
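
As an illustrative sketch of what up-weighting the rare class means for the loss being optimized (this is a generic weighted log-loss, not necessarily the exact objective our tools use; the weight ratio is the cross-validated parameter mentioned above):

// Sketch: logistic log-loss with conversions (label = 1) up-weighted.
// scoreFn returns the predicted P(conversion) for a feature map.
function weightedLogLoss(examples, scoreFn, positiveWeight) {
  var total = 0, weightSum = 0;
  examples.forEach(function (ex) {
    var p = scoreFn(ex.features);                 // predicted P(conversion)
    var w = ex.label === 1 ? positiveWeight : 1;  // e.g. 100:1 when conversions are rare
    total += -w * (ex.label * Math.log(p) + (1 - ex.label) * Math.log(1 - p));
    weightSum += w;
  });
  return total / weightSum;
}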


Overfitting and Regularization: One of the advantages of Logistic Regression is that it allows us to determine which features are more important than others. To achieve this, we use the Elastic Net methodology developed by Tibshirani et al. by incorporating it within Logistic Regression. Elastic Net allows a trade-off between L1- and L2-regularization (explained below) and lets us obtain a model with lower test prediction error than non-regularized logistic regression, since some coefficients can be adaptively shrunk towards lower values, as shown in the equation below. The formulation below can be interpreted as follows. During the training phase and for given hyperparameters λ and 0 ≤ α ≤ 1, we find the coefficients β that minimize the argument in Equation 1. When α = 1 (respectively 0), this reduces to L1-regularized (respectively L2-regularized) logistic regression. When there are multiple features that are correlated with each other, L1-regularization picks one of them while forcing the coefficients of the others to zero. This has the advantage that we can remove noisy features and keep the ones that matter more; however, in the case of correlated features, we run the risk of removing equally important ones. In contrast, L2-regularization never forces any coefficient to zero, and in the presence of multiple correlated features it selects all of them while assigning each roughly equal weight. Varying the hyperparameter α between 0 and 1 hence allows us to keep the desirable characteristics of both L1- and L2-regularization.
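
The image carrying Equation 1 did not survive formatting; a standard elastic-net-penalized logistic regression objective of the kind described here (our reconstruction, not necessarily the exact notation of the original figure) is:

minimize over (β0, β):   −(1/N) · Σi [ yi·zi − log(1 + exp(zi)) ]  +  λ · [ (1−α)/2 · Σj βj² + α · Σj |βj| ]        (Equation 1)

where zi = β0 + Σj βj·xij, yi ∈ {0, 1} is the conversion label for impression i, the first term is the average negative log-likelihood of the logistic model, and the second term is the elastic net penalty blending the L2 (ridge) and L1 (lasso) components via α.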


Performance: To measure the performance of our machine-learning models, we compute Precision and Recall, defined as follows:


  • Precision = Impressions correctly predicted as conversions / Total predicted conversions

  • Recall = Impressions correctly predicted as conversions / Actual conversions


Fig. 1: Precision and Recall for different models


Figure 1 above shows the performance of various models when applied to a hold-out test data set. The plot shows precision and recall values, and in general a model that captures more Area-Under-the-Curve (AUC) is better. The L1-regularized Logistic Regression with λ = 0.001 performs the best, followed closely by λ = 0.01 and by an online learning model built using Vowpal Wabbit. The most interesting insight here is that the model built with the smaller λ = 0.001 value performs the best as it forces more shrinkage in the feature set, so many more irrelevant features are forced to weights of 0; this model therefore avoids overfitting, leading to higher performance on the test data set. The other highly interesting insight is that the online learning model has performance very comparable to the batch-learned one.


 

Fig. 2: Regression weights for localHourOfDay


Next, we take a look at the interpretability of Logistic Regression in greater detail. For instance, Figure 2 shows the coefficients assigned to different hours of the day. Note that the hours of 5 pm, 7 pm and 8 pm have positive coefficients, indicating that these features increase the probability of an impression being classified as converted, which matches our intuition since evening is prime time to show ads. For instance, localHourOfDay of 5 pm has a coefficient of 0.16, and exp(0.16) = 1.1735, i.e., the odds p/(1-p) of classifying an impression as converted vs. not converted at 5 pm increase by 17.35%. In contrast, the localHourOfDay of 1 am is assigned a negative coefficient, which once again matches our intuition in that this feature indicates a lower chance of an impression being classified as converted.


Conclusion: Hope you enjoyed reading our introductory blog post on Data Science at Flurry. If you have any feedback or would like to learn more about any particular content mentioned here, please do leave a comment below. And above all, stay tuned for more insightful articles from the Data Science team!

 

JavaScript Lessons From the Flurry Trenches

The early 2000s was an interesting time for the internet.  Windows XP and Internet Explorer 6 dominated in market share.  A young, dashing, intrepid future engineer was just learning the ways of the web.  His tools of the trade: analyzing scripts from DynamicDrive that made falling snowflakes and browsing partially constructed websites hosted by GeoCities.


Looking back now, I cannot believe all the misconceptions I picked up about JavaScript by trying to learn the language from such poor examples.  Initially, I thought of it as nothing but a simple scripting language for making silly effects.  Since then, both the internet and I have matured.  The average level of JavaScript expertise being published has greatly increased and I have cleared up those misconceptions.  The times when I have made the greatest gains in understanding JavaScript have come from investigating the more difficult nuances of the language.  Hopefully, by discussing some of these details, I can capture your curiosity and motivate further learning.

These details about JavaScript might seem very complex to a beginner.  Honestly, most of the things you need to do on a daily basis as a frontend engineer do not require intimate knowledge of these topics.  However, working at Flurry, over the course of many years, I have encountered several problems where knowing these details have allowed me to come up with much better solutions than if I were fumbling around blindly.  A good craftsman should not only never blame his tools, but should also understand as much about how they can be used as possible.

Objects, Primitives, and Literals, oh my!

There seems to be some confusion as to whether primitives exist in JavaScript or if everything is an object.  The easiest ones to rule out are null and undefined, which are clearly not objects.

The three trickier cases are strings, booleans, and numeric values.  These can be explicitly created as objects using the String, Boolean, and Number constructors. However, if variables are defined using literals (e.g. "foo", true, 3), they will contain an immutable primitive value.

These values will be automatically cast as the object version of the primitive when necessary, allowing for the object methods to be referenced.  The object is immediately discarded after use. In this example, I will use Java-like syntax to explain what is going on behind the scenes.  In this pseudocode, assume there exists a primitive type string and a corresponding String object just like Java’s.

//Javascript Version			//"Java" version
var primitive = "foo";			string primitive = "foo";

primitive.charAt(0);			((String) primitive).charAt(0);
//f					//the object created by casting is 
					//not saved anywhere

primitive.someProp = 3;			((String) primitive).someProp = 3;
					//the object created by casting is not 
					//saved anywhere and neither is someProp

alert(primitive.someProp);		alert(((String) primitive).someProp);
//undefined				//someProp does not exist on primitive 
					//because it is a primitive value and not 
					//an object

To the casual observer, it might seem like these literals are created as objects because they respond to all the method calls that the object versions of the primitives would.  The automatic casting built into JavaScript, a consequence of its nature as a weakly-typed language, assists in perpetuating this illusion.

Further Reading:
Mozilla JavaScript Reference: Primitives and Objects
JavaScript Garden: Objects

Passing by Reference…JavaScript can do that? Nope.

Modern languages abstract a lot of memory management complexity away from the user.  Anyone who has learned about pointers in C/C++ would likely find this a helpful feature for the general case. However, it is still important to understand the basics of what is being abstracted away from you.

See this little snippet of code:

function change(primitive, obj1, obj2){
	primitive = 1;
	obj1.prop = 1;
	obj2 = {prop : 1};

	console.log(primitive); //1
	console.log(obj1); 	//{prop : 1}
	console.log(obj2); 	//{prop : 1}

	obj2.prop = 2;
	console.log(obj2); 	//{prop : 2}
}

var a = 0,
   b = {},
   c = {prop : 0};

change(a,b,c);

console.log(a); 		//0
console.log(b); 		//{prop : 1}
console.log(c); 		//{prop : 0}

If you read the comments about the output of that code, you might be a little surprised to see that c did not change after the function call.  This is because JavaScript is much like Java in that it only passes arguments by value and secretly uses pointers to pass object arguments around.

The primitive/a argument is simply passed by value, which means that a copy of the value in a is used by the function change when dealing with primitive.  None of the changes to primitive are propagated to a.

Now, if we look at the obj1/b example, you might make a claim like “JavaScript supports pass-by-reference for objects”.  I certainly did for a long time before I realized what was actually happening.  So, if obj1 is a copy of a value, what is that value?  It is a copy of b, which is really a pointer containing the address in memory where the object b refers to is stored.

In C/C++, the developer must be cognizant of whether some variable is an object or a pointer.  There are different operators for each case, the dot and arrow operators.  The dot is used for directly accessing an object’s methods.  The arrow is used for accessing a member through a pointer to an object.  In JavaScript (and Java), there is only one such operator, the dot.  This is because there is no way to separately refer to an object and a reference or pointer to an object.  Every time you think you are dealing with an object in JavaScript, you are using a pointer to that object, but all of that complexity is abstracted away from you as a developer.

Now that we (hopefully) understand that very intricate concept, we can explain what is happening with obj2/c.  Because obj2 is a copy of the memory address held by c, we can operate on it like we would any other object in the scope of change.  However, when we assign obj2 the object literal {prop : 1}, we overwrite the memory address stored in obj2 and perform further operations on the object at that new address.  c still references its original memory address, which is why it is unchanged outside of the change function.

Further Reading:
Details on Java’s Pass-By-Value

Prototypes, All the Way Down

JavaScript is unique amongst today’s popular programming languages in its use of prototypal inheritance chains.  Every object in JavaScript has a prototype that is a reference to an instance of another object, continuing up the chain until it reaches the built-in Object.prototype.  Each object instance holds only its own properties and methods and relies on its prototype chain for everything that needs to be inherited.  When a method is called on an object, the engine will go up the prototype chain looking for that method.  This can present advantages, like memory savings, and disadvantages, like creating unexpected conflicts down the line when changing the prototype of some object higher up the chain.

There are many good articles out there that will delve deeper into JavaScript prototypes.  What I find interesting is how the new keyword works in this context.  When new is used in front of a function, that function becomes a constructor for a new object whose prototype is that function’s prototype.  Remember, functions are first-class citizens in JavaScript and behave very similarly to objects.

Essentially var x = new foo(); becomes

var x = {};
x.constructor = foo; 		//effectively; x actually inherits "constructor"
				//from foo.prototype rather than owning it
x.__proto__ = foo.prototype; 	//done internally; the [[Prototype]] link, which
				//some browsers expose as __proto__
foo.call(x); 			//foo is called with x as the execution context (this)

This behavior where new is actually creating an object before calling the function is important for understanding how the this keyword works.

this, very, very simply put, refers to the execution context a function is being called from.  An execution context can be thought of as the object a function is called on.  The global scope is also an object, window.

Below are all the different values this might have in different contexts.

//this == window;
(function(){
	//this == window;
	var x; 		//in the function scope
})();

var obj = new function(){
	//this == obj;
}();

var obj2 = {
	fn : function(){
		//this == obj2
	}
}

var y = {};
(function() {
	//this == y
}).call(y);

Because of all the different values this can have, a lot of developers will save references to the this context with their own variables like self, that, _this.  While I think this practice can be very useful, I would recommend more explicit variable names that actually describe what the context is supposed to represent.

Further Reading:
Mozilla JavaScript Reference: How prototype Works
Mozilla JavaScript Reference: How call Works
JavaScript Garden: this

Closing it Out with Closures

JavaScript supports function scope but not block scope.  This is a departure from most languages.  A scope is usually a block of code enclosed by braces; variables declared in that scope do not exist outside of it.  In JavaScript, if/while/for blocks leak variables declared in their blocks to the enclosing function or global scope.

Here is some code that captures this little gotcha.

var projectIds = [1,2,3,4],
    index = 0,
    size = projectIds.length
;

for ( ; index < size ; index++) {
	var projectLink = getProjectLink(index);
	projectLink.on("click", function(){
		makeAjaxRequest("/getProjectDetails.do?projectId=" + projectIds[index]);
	});
	addToPage(projectLink);
}

This code snippet will cause every project link to fetch the details for projectId=undefined, because projectIds[4] does not exist.

Why?  var is function-scoped, so the for loop puts index in the global scope and, after it runs, index persists with a value of 4.  Each click handler runs after the completion of the for loop and closes over that same shared index variable, so it reads index as 4 and projectIds[4] is undefined.  Wrapping this code in a function would not fix the problem by itself, because the handlers would still share the single index variable from that function’s scope.

However, if we slightly modify the code, we can make this work as intended.

(function(){
	var projectIds = [1,2,3,4],
	    index = 0,
	    size = projectIds.length
	;

	var addClickHandler = function(link, projectId){
		link.on("click", function(e){
			makeAjaxRequest("/getProjectDetails.do?projectId="
				 + projectId);
		});
	};

	for ( ; index < size ; index++) {
		var projectLink = getProjectLink(index);
		addClickHandler(projectLink, projectIds[index]);
		addToPage(projectLink);
	}
})();

Why does this work? Closures.

A closure is a combination of a function and its referencing environment.  Because JavaScript only has function scope, this means that all functions are closures and always have access to their outer scope.

In this example, each call to addClickHandler creates a new scope with its own copy of the value of projectId.  The click handler closes over that copy, which the for loop cannot change.

Further Reading:
Douglas Crockford’s JavaScript: The Good Parts section on closures

Author: Kenny Lee


Source Code Analyzers as a Development Tool

It is difficult to write consistent, high-quality code when using libraries/SDKs from multiple sources and when development is distributed between several teams and multiple time zones.  Many challenges exist for both new and experienced developers, including lack of documentation, insufficient unit test coverage, and nuances of each platform/SDK that make things different. It becomes necessary for developers of one platform to understand complicated legacy code of an unfamiliar platform. To make things more complex, it may be written in a language they do not understand well.  It is estimated that up to 60-80% of a programmer’s time is spent maintaining a system and 50% of that maintenance effort is spent understanding the program (http://www.bauhaus-stuttgart.de/bauhaus/index-english.html).

It is helpful for developers to have tools that can analyze different codebases quickly. A rather comprehensive list of source code analyzer tools for each platform is available here: http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis.  Since an in-depth comparison of the multitude of analyzer tools is beyond the scope of this article, all figures and analysis were done using Understand by SciTools, a typical albeit premium source code analysis tool.

Understand by SciTools

Understand by SciTools has the ability to scan Ada, COBOL, C/C++, C#, Fortran, Objective-C, Java, Jovial, Pascal, PL/M, Python and others. Like many multi-language analyzer programs it is not free; however, the benefits of such a program are enormous. (For the purposes of this demonstration, a deprecated and unused codebase was analyzed.)

After the source code files have been parsed, you’ll see a multi-windowed view like the following:

Figure 1. Parsing of project sources

Figure 2. IDE source windows.

Pros of Source Code Analyzer tool

The ability to see how a variable/method in the project is analyzed, how it is used (or unused) in the project, and which methods call it or are called by it is a quick way to get reference information. This can also be done quite easily in the various IDEs for the respective languages – Xcode, Eclipse, IntelliJ, etc. What sets premium multi-language source code analysis tools apart from IDEs is the ability to see graphically how the source code is structured and to run metrics on it as a whole.  In Figure 3, for example, we notice that the androidTestApp has 113 references to the androidAdAgent library, while the flurryAdTestApp has about 25 dependencies on that project. Further analysis reveals that the flurryAdTestApp is a generic sample project for testing ad functionality while the androidTestApp is a more universal testing application. There are many more benefits to knowing these internal dependencies – for example, knowing how complex a dependency is makes it easier to estimate how much QA is required if that code is refactored.

Figure 3. Internal architecture complexity

Figure 4. UML class diagram

Figure 4 shows the overall UML class structure of the project. This is particularly useful if you need to refactor or re-engineer your code base to specific design patterns. Some of these analyzer tools allow you to directly manipulate the UML diagrams, and the underlying code structure is changed as a result.

Figure 5. Unused variables and parameters

The ability to see unused variables and parameters is particularly useful in reducing code bloat and keeping the codebase lean.

Figure 6. Check code for various coding standards.

Quite a few of these premium analyzer tools have code check algorithms that notify you of overly complex code (big-O, number of lines, cyclomatic complexity, and unused code paths). Overly complex programs are difficult to comprehend and have many possible paths, making them difficult to test and validate. Most analyzer tools allow you to record/program your own macros for determining proper code validation:

– Improper use of .equals() and .hashCode()

– Unsafe casts

– When something will always be null

– Possible StackOverflows

– Possible ignored exceptions

– Cyclomatic Complexity (modified, strict)

Cons of Source Code Analyzer tool

One drawback of using such source code analysis tools is that you have to configure the project to find all relevant sources for each project type – failure to have a project properly configured could result in too much or too little useful detail. Source code that is significantly modular across libraries would be difficult to analyze.  Also, many tools out there are quite one-dimensional in that they may check coding style but may not be able to provide detailed analysis of code complexity and improper algorithmic complexity. One open source tool that shows promise is Sonar (http://www.sonarsource.org), but its big drawback is that it requires a web server and a database. Other possible issues are that some code analyzers analyze the bytecode while others analyze the source. Whatever source analyzer tool is chosen, it may not be comprehensive enough for the organizational needs.

At Flurry, we have multiple SDK codebases – Objective-C, Java (Android), BlackBerry 10, Windows Phone 7/8, Windows 8, and HTML5. Each additional analyzer tool comes with a significant learning curve, so we try to keep the tools to a minimum while still getting coverage on the different codebases.

Source Code Analysis As Part of the Development Environment

There are many programs that analyze source code, but only a few provide support for multiple languages and are open source. The most popular ones for Java are PMD, FindBugs and Checkstyle.  However, a tool that is simple, multi-language and open source would be ideal.

The ability to easily see the UML structure of a program, understand its code complexity, and see all the dependencies with the click of a button can easily replace invalid comments and outdated documentation. A good source analyzer tool is only one part of the toolbox that should be available to the developer. Unit tests, code reviews, and pair programming should always have priority, but a source code analysis tool can definitely help developers (both new and experienced) keep track of the large codebases they may be working on.

Author: Richard Brett


Interview Strategies at Flurry

To meet the demands of the market, Flurry has grown rapidly. Our numbers have roughly doubled each year, and we plan to continue this trajectory for the foreseeable future. Because of this pattern, we spend a significant amount of time interviewing candidates. Here is our approach.


Synopsis


Our process is designed to help us discover as much as we can about each candidate as efficiently as possible. We begin with a phone screen, which is a low effort, low cost handshake between us and the candidate. This gives us a basic sense of their programming knowledge and their compatibility with us. Then, we follow up with a code test that evaluates their ability to design and implement a solution to the proposed challenge. Finally, we conduct two onsite interviews, one of which covers the design decisions made in their code test and their technical abilities, and another which covers their creative problem solving skills and delves into their past experiences.

During the two interviews, we assess the following:

  • Communication skills

  • Problem solving skills

  • Design ability

  • Team fit



Giving Hints

Most candidates will get stuck solving a problem at least once during the interviews. This is our opportunity to direct the conversation with a hint. A good hint can sometimes be very difficult to give. If an interviewee has chosen to answer a problem solving question in a way that is unusual but not necessarily incorrect, and then gets stuck, it is up to us to think ahead and supply a hint that helps them get to where they want to go. This can be especially challenging if the interviewee knows more about the subject than we do. The alternative is to give a hint that sets the candidate onto a path that leads to a known good solution, but this has several drawbacks. First, it wastes time because the interviewee must start again. Second, the candidate often still has their original idea in mind, which distracts them. Finally, pointing someone in a completely different direction is jarring and throws them off, especially if they start analyzing whether they have jeopardized their chances at an offer by getting the question wrong. Ideally, a good hint will provoke a thoughtful discussion about how to arrive at a solution. This also introduces a collaborative element to the interview and gives the interviewee a chance to teach us something. By digging in like this, we gain a wealth of information about their ability to communicate and how they approach problems.

 

Reviewing the Code Test

The code test tells us several things about the candidate:

  • How they write code in a natural setting

  • How they structure blocks of code – is it organized into logical units of work?

  • How they test their code – are their tests concise and relevant? What do the tests cover? Is their code designed to be testable?

  • How they design their algorithms – what approach does the candidate take?

Reviewing the test with the candidate mimics an actual code review. We discuss the compromises they made, such as performance, readability, and testability. We also explore the design decisions the candidate made to gain insight into how they prioritize their development practices. This also gives us a sense of how they respond to feedback: do they give justifications for their design decisions? Do they readily acknowledge mistakes?

 

Digging Deeper

Any candidate can memorize the answers to common interview questions, but eliciting such canned responses will not give us useful information about their abilities as a programmer. Therefore, we choose questions that lead to interesting discussion. For a hypothetical example, take the simple question of “What is a tree set?” If the candidate talks about the O(log n) find/insert/remove operations, we could follow up with “Since a tree is faster than a linked list for common operations, what are some reasons to use a linked list instead of a tree set?” This forces the candidate to explain their thought process, which is much more valuable than knowing whether the candidate can regurgitate runtime efficiencies of common structures.

Listening Well

At Flurry, we care about a candidate’s communication skills and fit just as much as their technical abilities. We pick up that information in a few ways, and a lot of it is gleaned during the technical portions. If a technical question is too hard or unexpected, we can look at an interviewee’s coping mechanisms – do they panic? What is their thought process – does it jump around, or does it logically build from a set of premises? If a question is too easy, how does the candidate respond? Are they arrogant or dismissive? These are the cues that tell us whether an interviewee is confident about what they know and don’t know and is able to think for themselves. This kind of information can come from anywhere at any time, so we make sure to stay attentive and take notes regardless of what is happening. By the end of the process, several different interviewers must come to a single conclusion, and if anybody thinks the candidate is not qualified, we pass on them.

Describing Flurry to the Interviewee

Anyone familiar with the process knows that interviews are just as much about helping the candidate learn about the company as they are about evaluating the candidate. Since Flurry values intelligence and honesty, we want to attract the same qualities in our candidates. To that end, during the interview process, we are truthful about Flurry’s strengths and weaknesses and present a clear picture of what it will be like working here. The people interviewing the candidate are their future colleagues, and interviews take place in the same location as their future work environment. By the end of it, a candidate should have a good idea what it is like to work here, should they accept an offer.

 

 

Author: Jon Miller


APNS Test Harness

As Dr. Evil found out after spending a few decades in a cryogenic freezer, perspectives on quantity change very quickly. Ever since the explosion of mobile apps and the growing number of services that can deal with humongous amounts of data, we need to re-define our concepts of what ‘Big Data’ means. This goes for developers who want to write the Next Great App, as well as those who want to write services to support it.

One of the best ways of connecting with mobile application customers is via remote push notifications. This is available on both Android (GCM) and iOS (APNS). These services allow developers to send messages directly to their users and this is an extremely valuable tool to announce updates, send personalized messages and engage directly with the audience. Google and Apple provide services that developers can send push messages to and they in turn deliver those messages to their users.


The Problem

It’s not unusual for apps these days to have on the order of millions or even tens of millions of users. Testing a push notification backend can be extremely hard. Sure, you can set up a few test devices to receive messages, but how do you know how long it would take your backend to send out a large number of push messages to the Google and Apple servers? Also, you don’t want to risk being throttled or completely blacklisted by either of those services by sending a ton of test data their way.

The Solution

The solution is to write a mock server that’s able to simulate the Google/Apple Push Notification Service, and a test client to hit it with requests.

The Google service is completely REST based, so a script that executes a lot of curls in a loop can do that job. It’s also fairly straightforward to write a simple HTTP server that accepts POSTs and sends back either a 200 or some sort of error code.

Apple’s APNS, however, presents a few challenges. It uses a binary format, documented here. Since the protocol is binary, you need to write some sort of mock client that can generate messages in the specified format. At Flurry, we’ve been playing around with Node.js to build scalable services, and it’s fairly straightforward to set up an APNS test client and server with it.

The Client

https://gist.github.com/rahuloak/4949310

The client.connect() method connects to the mock server and generates test data. The Buffer object in Node is used to pack the data into a binary format to send it over the wire. Although the protocol lets you specify a token size, the token size has been set to 64 bytes in the client since that’s typically the token length that gets generated. Also, in our experience, the APNS server actually rejects tokens that aren’t exactly 64 bytes long. The generateToken() method generates 64 byte hex tokens randomly. The payload is simple and static in this example. The createBuffer method can generate data in both the simple and enhanced format.
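
For readers who don’t want to click through, here is a hypothetical sketch of the kind of Buffer packing involved; it is not the gist’s exact code. The simple notification format is a command byte, a two-byte token length, the token, a two-byte payload length, and the JSON payload; the sketch follows the post’s convention of a 64-byte hex token.

// Hypothetical sketch of packing an APNS-style "simple format" frame with
// Node's Buffer API; not the exact code from the gist above.
function buildSimpleFrame(tokenHex, payloadObj) {
  var token = Buffer.from(tokenHex, 'utf8');      // 64-byte hex token, per the convention above
  var payload = Buffer.from(JSON.stringify(payloadObj), 'utf8');
  var frame = Buffer.alloc(1 + 2 + token.length + 2 + payload.length);
  var offset = 0;
  frame.writeUInt8(0, offset); offset += 1;                  // command byte: 0 = simple format
  frame.writeUInt16BE(token.length, offset); offset += 2;    // token length
  token.copy(frame, offset); offset += token.length;         // device token
  frame.writeUInt16BE(payload.length, offset); offset += 2;  // payload length
  payload.copy(frame, offset);                               // JSON payload
  return frame;
}

// e.g. socket.write(buildSimpleFrame(generateToken(), { aps: { alert: 'hello' } }));
// where generateToken() is the random hex token helper described above.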
 
What good is a client without a server, you ask? So without further ado, here’s the mock server to go with the test client.

The Server

https://gist.github.com/rahuloak/4949381

After accepting a connection, the server buffers everything it receives into an array and then reads the buffers one by one. APNS has an error protocol, but this server only sends back a 0 on success and a 1 otherwise. Quick caveat: since the server stores data in a variable until it gets a FIN from the client (on ‘end’) and only then processes it, the {allowHalfOpen:true} option is required on createServer so that the socket is not automatically closed when the client finishes sending, leaving the server free to write its response.
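
Along the same lines, here is a minimal, hypothetical sketch of the mock server’s shape (again, not the gist’s exact code):

var net = require('net');

// allowHalfOpen keeps the socket writable after the client's FIN arrives,
// so the server can still reply once it has processed the buffered data.
var server = net.createServer({ allowHalfOpen: true }, function (socket) {
  var chunks = [];
  socket.on('data', function (chunk) {
    chunks.push(chunk);                  // buffer everything until the client ends
  });
  socket.on('end', function () {
    var data = Buffer.concat(chunks);    // parse the buffered frames here as needed
    socket.end(Buffer.from([0]));        // reply 0 on success, 1 otherwise
  });
});

server.listen(2195);                     // 2195 is the port the real APNS gateway uses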

This setup is fairly basic, but it is useful for many reasons. Firstly, the client could be used to generate fake tokens and send them to any server that would accept them (just don’t do it to the APNS server, even in sandbox mode). The data in the payload in the above example is static, but playing around with the size of the data as well as the number of blocks sent per request helps identify the optimal size of data that you would want to send over the wire. At the moment, the server does nothing with the data, but saving it to some database or simply adding a sleep in the server would be a good indicator of estimated time to send a potentially large number of push messages. There are a number of variables that could be changed to try and estimate the performance of the system and set a benchmark of how long it would take to send a large batch of messages.

Happy testing!


Tech Women

4th grade budding girl geek in the making (2nd row, 2nd girl from the left)

 
I grew up in a small suburb of New York fascinated with math and science. 3-2-1 Contact was my all-time favorite show then, and getting their magazine was such a joy. As a young girl it was fun to try out the BASIC programs they published, programming with a joystick and running them on my Atari system (yes, programming with a joystick or paddle is just as useful as the MacBook Wheel). It seemed like a no-brainer to dive into computers when I started college. Women in my family were commonly in the sciences, so entering my college CS program was a bit of a culture shock for me; I could actually count all the women in my class year on one hand!
 
After graduating and working at a range of tech companies as a Quality Assurance Engineer, from big players to small startups, I’ve always had the desire to give back to the tech community. Only recently, however, did I find the right avenue. One day a co-worker of mine shared a link with me about the TechWomen program. From their website:
TechWomen brings emerging women leaders in Science, Technology, Engineering and Mathematics  from the Middle East and Africa together with their counterparts in the United States for a professional mentorship and exchange program. TechWomen connects and supports the next generation of women entrepreneurs in these fields by providing them access and opportunity to advance their careers and pursue their dreams.
 As soon as I read that, I applied right away.  This was exactly the type of program I was looking for to help share what I’ve learned.

 
It must have been written in the stars, as I was accepted as a mentor in the program.  I was matched with Heba Hosny, an emerging leader from Egypt.  She works as a QA Engineer at Vimov, an Alexandria-based mobile application company. During her three-week internship at Flurry she was involved in testing the full suite of Flurry products.

During Heba’s stay with us she was like a sponge, soaking up the knowledge to learn what it takes to build and run a fast-paced, successful company in Silicon Valley. In her own words,

“EVERYBODY likes to go behind the scenes. Getting backstage access to how Flurry manages their analytics business was an eye opening experience for me. I was always curious to see how Flurry makes this analytics empire, being behind the curtains with them for just a month has been very inspiring for me to the extent that some of what Flurry does has became pillars of how I daily work as a tester for Flurry analytics product used by the company I work for.

In a typical Internship, you join one company and at the end of the day you find yourself sitting in the corner with no new valuable information. You have no ability to even contact a senior guy to have a chat with him. Well, my internship at Flurry was the total OPPOSITE of that.

The Flurry workplace is different. In Flurry managers, even C levels, are sitting alongside engineering, business development, marketing, sales, etc. This open environment allowed me to meet with company CEO, CTO, managers, and even sitting next to the analytics manager.

 In short, an internship at Flurry for me was like a company wide in-depth journey of how you can run a superb analytics shop and what it’s like to deal with HUGE amounts of data like what Flurry works with .”

Working with Heba during her internship was a great experience, and hosting an emerging leader was very fruitful. In QA we were able to implement some of the new tools Heba introduced to us, such as the test case management tool Tarantula. Heba also gave us the opportunity to learn more about her culture and gave members of our staff a chance to practice their Arabic. The San Francisco Bay Area is a very diverse place, but this was the first chance many of us have had to hear a first-hand account of the Arab Spring.

From our experience in the tech field, it’s obvious that the industry suffers from a noticeable lack of strong female leadership at the top. It’s time that women who value both a rich home life and a fulfilling career explore the tech startup world. Participating in programs such as TechWomen will help in this regard. These programs benefit not only the mentee and mentor, but the industry as a whole. Mentees who gain experience in Silicon Valley tech companies will pay it forward to the next generation of tech women in their communities by sharing their experiences. Mentors in the program not only learn from their mentees but are able to create a sense of community to help make sure the mentee has a successful internship. Company-wise, participating in programs like TechWomen brings tremendous exposure to Flurry outside of the mobile community. As we enrich more women’s lives in the tech field, we can share even more experiences to help inspire young women and girls to know it’s possible to touch the Silicon Valley dream, no matter where in the world they are.

For more information:


The Benefits of Good Cabling Practices

An organized rack makes a world of difference in tracing and replacing cables, easily removing hardware, and most importantly increasing airflow. By adopting good cabling habits, your hardware will run cooler and more efficiently, and you’ll ensure the health and longevity of your cables. You also prevent premature hardware failures caused by heat retention. Good cabling practices don’t sound important, but they do make a difference. It’s also nice to look at, or show off to your friends/enemies.

When cabling, here are some practices Flurry lives by:

Label everything

There has never been a situation where you’ve heard someone say, “I wish I hadn’t labeled this.” Labeling just makes sense. Spend the extra time to label both ends of the network and power cables. Your sanity will thank you. If you’re really prepared, print out the labels on a sheet ahead of time so they’ll be ready to use.

Cable length

When selecting cable length, there are two schools of thought: those who want exact lengths and those who prefer a little extra slack. The majority of messy cabling jobs come from selecting improper cable lengths, so use shorter cables where possible. A good option is custom-made cables: you get exactly the length you need without any excess, though this option is usually expensive in either time or money. The other option is to purchase standard-length cables. Assuming that you have a 42U rack, the furthest distance between two network ports is a little over six feet. In our rack build-outs, we’ve had great results using standard five-foot network cables for our server-to-switch connections.

Cable management arms

When purchasing servers, some manufacturers provide a cable management arm with your purchase. They allow you to pull out your server without unplugging any cables. In exchange for this benefit, they add bulk, retain heat, and reduce proper airflow. If you have them, we suggest that you don’t use them. Under most circumstances, you would unplug all cables before you pull out a server anyway.

No sharp bends

Cables do require a bit of care when being handled. A cable’s integrity can suffer from any sharp bend, so try to avoid them. In the past, we have seen port speed negotiation problems and intermittent network issues caused by damaged network cables.

Use mount points

As you group cables together, utilize anchor points inside of the rack to minimize stress on cable ends. Prolonged stress on the cable ends can cause the cable and the socket it’s connected to to break. Power cables are also known to come unplugged: the weight of a bundled run of power cables can gradually pull a plug out. Using anchor points will help alleviate stress directed at the outlet.


 

Less sharing

Isolate different types of cables (power, network, KVM, etc.) into different runs. Separating cable types will allow for easy access and changes. Bundled power cables can cause electromagnetic interference on surrounding cables, so it is wise to separate power from network cables. If you must keep copper network and power cables close together, try to keep them at right angles. In our setup, standing at the back of the rack, network cables are positioned on the left-hand side of the rack while power cables are generally on the right.

Lots and lots of velcro

We saw the benefits of velcro cable ties very early on. They have a lot of favorable qualities that plastic zip ties do not: they’re easy to add, remove, and retie, and they’re great for mounting bundled cables to anchor points inside the racks. If your velcro ties come with a slotted end, do not give in to the urge to thread the velcro through the ends; it’s annoying to unwrap and rethread. Don’t be shy about cutting the velcro to length, either; using just the right length of velcro makes it easier to bundle and re-bundle cables.

Now that you have these tips in mind, let’s get started on cabling a Flurry rack.

1. Facing the back of a 42U rack, add a 48-port switch at about the middle of the rack (position 21U, i.e. 21st from the bottom). Once you have all your servers racked, now comes the fun part: cabling. Let’s start with the network.

2. From the topmost server, connect the network cable to the top left port of your switch, which should be port 1.

3. As you go down the rack, connect the network cables on the top row of ports from left to right on the switch (usually odd numbered ports). Stop when you’ve reached the switch.


4. Using the velcro cable ties, gather together the cables in a group of ten and bundle the cabled groups with the cable ties. Keep the bundle on the left hand side of the rack. You will have one group of ten and one group of eleven that form into one bundled cable.


5. For the bottom set of servers, start with the lowest server (rack position 1U) and connect the network cable to the bottom left most port on the switch.

6. Starting from the bottom up, connect the network cables on the bottom row of ports from left to right on the switch (usually even numbered ports).


7. Doing the same as the top half of the rack, gather together the cables in a group of ten and bundle the cabled groups with the cable ties. Keep these bundles on the left hand side of the rack. You’ll end up with two bundles of ten that form into one bundled cable. Look pretty decent?

8. Now, let’s get to power cabling. In this scenario, we will have three power distribution units (pdus), one on the left and two on the right side of the rack. Starting from the top of the rack, velcro together five power cables and plug them into one of the pdu strips on the left side of the rack from the top down.


9. Take another two sets of four bundled power cables and plug them into the other pdu strips on the right hand side also following the top to bottom convention. You should end up with a balanced distribution of power plugs.


10. Take a bundle of six power cables and plug them into the pdu strip on the left hand side.

11. Take another two sets of four power cables and plug them into the two pdu strips on the right hand side.


12. Start from the bottom up, bundle the power cables in groups of five. You will end up with two sets of five power cables and a bundle of four.

13. Plug the bundle of four power cables into the pdu on the left hand side.


At this point, you can take a step back and admire your work. Hopefully, it looks sort of like this:


Good cabling can be an art form. As in any artistic endeavor, it takes a lot of time, patience, skill, and some imagination. There is no one size fits all solution, but hopefully this post will provide you with some great ideas on your next rack build out.


Exploring Dynamic Loading of Custom Filters in HBase

Any program that pulls data from a large HBase table containing terabytes of data spread over many nodes needs to put a bit of thought into how it retrieves that data. Failure to do this may mean waiting for, and subsequently processing, a lot of unnecessary data, to the point where it renders the program (whether a single-threaded client or a MapReduce job) useless. HBase’s Scan API helps in this respect. It configures the parameters of the data retrieval, including the columns to include, the start and stop rows, and batch sizing.
 
The Scan can also include a filter, which can be the most impactful way to improve the performance of scans over an HBase table. This filter is applied to a table and screens unwanted rows out of the result set. A well-designed filter is performant and minimizes the data scanned and returned to the client. There are many useful Filters that come standard with HBase, but sometimes the best solution is a custom Filter tuned to your HTable’s schema.

Before your custom filter can be used, it has to be compiled, packaged in a jar, and deployed to all the regionservers. Restarting the HBase cluster is necessary for the regionservers to pick up the code in their classpaths. Therein lies the problem – an HBase restart takes a non-trivial amount of time (although rolling restarts mitigate that somewhat) and the downtime is significant with a cluster as big as Flurry’s.

This is where dynamic filters come in. The word ‘dynamic’ refers to the on-demand loading of these custom filters, just like loading external modules at runtime in a typical application server or web server. In this post, we will explore an approach that makes this possible in the cluster.

How It Works Now
Before we dive into the workings of dynamically loading filters, let’s see how regular custom filters work.

Assuming the custom filter has already been deployed in a jar on the regionservers’ classpath, the client can simply use the filter, e.g. in a full table scan, like this:

https://gist.github.com/4227003

This filter will have to be pushed to the regionservers to be run server-side. The sequence of how the custom filter gets replicated on the regionservers is as follows:

  1. The client serializes the Filter class name and its state into a byte stream by calling the CustomFilter’s write(DataOutput) method.
  2. The client directs the byte array to the regionservers that will be part of the scan.
  3. Server-side, the RegionScanner re-creates the Scan, including the filter, using the byte stream. Part of the stream is the filter’s class name, and the default classloader uses this fully qualified name to load the class using Class.forName().
  4. The filter is then instantiated using its empty constructor and configured using the rest of the filter byte array via the filter’s readFields(DataInput) method (see org.apache.hadoop.hbase.client.Scan for details).
  5. The filter is then applied as a predicate on each row.

(1) myfilter.jar containing our custom filter resides locally in the regionservers’ classpath
(2) the custom filter is instantiated using the default Classloader

Once deployed, this definition of our custom filter is static. We can make ad hoc queries using combinations of filters, but if we need to add, extend or replace a custom filter, it has to be added to the regionservers’ classpath and we have to wait for the next restart before those filters can be used.

There is a faster way.

Dynamic Loading
A key takeaway from the previous section is that Filters are Writables – they are instantiated using the name of the class and then configured by a stream of bytes that the Filter understands. This makes the filter configuration highly customizable and we can use this flexibility to our advantage.

Rather than create a regular Filter, we introduce a ProxyFilter, which acts as the extension point through which we can load our custom Filters on demand. At runtime, it loads the custom filter class itself.

Let’s look at some example code. To start with, there is just a small change we have to make on the client; the ProxyFilter now wraps the Filter or FilterList we want to use in our scan.

https://gist.github.com/4227336

The ProxyFilter passes its own class name to be instantiated on the server side, and serializes the custom filter after.

https://gist.github.com/4227434

On the regionserver, the ProxyFilter is initialized in the same way as described in the previous section. The byte stream that follows should minimally contain the filter name and its configuration byte array. In the ProxyFilter’s readFields method, the relevant code looks like this:

https://gist.github.com/4227524

This is very much like how the default Scan re-creates the Filter on the regionserver with one critical difference – it uses a filterModule object to obtain the Class definition of the custom filter. This module retrieves the custom filter Class and returns it to ProxyFilter for instantiation and configuration.

There can be different strategies for retrieving the custom filter class. Our example code copies the jar file from the Hadoop filesystem to a local directory and delegates the loading of the Filter classes from this jar to a custom classloader [3].
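
One hedged sketch of such a module follows (HdfsJarModule and getFilterClass are illustrative names; the real versions live in the example code [5] and use the child-first classloader from [2]): copy the jar out of HDFS onto local disk, then load filter classes through a classloader pointed at that local jar.

import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.filter.Filter;

public class HdfsJarModule {
    private final Configuration conf;
    private final Path hdfsJar;      // HDFS path to the filter jar (configured below)
    private final File localDir;     // local scratch directory on the regionserver
    private ClassLoader loader;

    public HdfsJarModule(Configuration conf, Path hdfsJar, File localDir) {
        this.conf = conf;
        this.hdfsJar = hdfsJar;
        this.localDir = localDir;
    }

    public synchronized Class<? extends Filter> getFilterClass(String name) throws Exception {
        if (loader == null) {
            File localJar = new File(localDir, hdfsJar.getName());
            FileSystem fs = FileSystem.get(conf);
            // pull the jar down from HDFS to the local filesystem
            fs.copyToLocalFile(hdfsJar, new Path(localJar.getAbsolutePath()));
            // the real example uses a child-first classloader [2][3];
            // a plain URLClassLoader is shown here only for brevity
            loader = new URLClassLoader(new URL[] { localJar.toURI().toURL() },
                    getClass().getClassLoader());
        }
        return Class.forName(name, true, loader).asSubclass(Filter.class);
    }
}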

To configure the HDFS path where the module looks for the filter jar, add the following property to hbase-site.xml:

<property>
  <name>flurry.hdfs.filter.jar</name>
  <value>/flurry/filters.jar</value>
</property>
[Diagram: how the ProxyFilter loads custom filters from HDFS]
(1) The custom filter jar resides in a predefined directory in HDFS.
(2) The proxyfilter.jar containing the extension point resides locally in the regionservers’ classpath.
(3) The ProxyFilter is instantiated using the default ClassLoader.
(4) If necessary, the rowfilter.jar is downloaded from a preset path in HDFS; a custom classloader in the ProxyFilter then instantiates the correct filter, and Filter interface calls are delegated to the enclosed filter.

With the ProxyFilter in place, it is now simply a matter of placing or replacing the jar in the Hadoop FS directory to pick up the latest filters.

Reloading Filters
When a new Scan is requested on the server side, the module first checks the filter jar in HDFS. If the jar is unchanged, the previously loaded classes are returned. If the jar has been updated, the module downloads it from HDFS again, creates a new classloader instance, and reloads the classes from the modified jar. The previous classloader is dereferenced and left to be garbage collected. Restarting the HBase cluster is not required.
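
Continuing the module sketch from above (again illustrative rather than the exact example code), the reload check might compare HDFS modification times before handing out classes:

private long loadedModTime = -1;

// called at the start of getFilterClass(); if the jar changed in HDFS,
// drop the old classloader so the next lookup re-downloads and reloads it
private synchronized void refreshIfNeeded() throws IOException {
    FileSystem fs = FileSystem.get(conf);
    long modTime = fs.getFileStatus(hdfsJar).getModificationTime();
    if (modTime != loadedModTime) {
        loader = null;               // the old classloader is left for GC
        loadedModTime = modTime;
    }
}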

The HdfsJarModule keeps track of the latest custom filter definitions, using a separate classloader for each jar version.

Custom classloading and reloading can be a class-linking, ClassCastException minefield, but the risk here is mitigated by the highly specialized use case of filtering. The filter is instantiated and configured per scan, and its object lifecycle is limited to the time it takes to do the scan in the regionserver. The example uses the child-first classloader mentioned in a previous post on ClassLoaders, which searches a configured set of URLs before delegating to its parent classloader [2].

Things to watch out for

  • The example code has a performance overhead as it makes additional calls to HDFS to check for the modification time of the filter jar when a filter is first instantiated. This may be a significant factor for smaller scans. If so, the logic can be changed to check the jar less frequently.
  • The code is also very naïve at this point. Updating the filter.jar in the Hadoop FS while a table scan is happening can have undesired results if the updated filters are not backward compatible. Different versions of the jar can be picked up by the RegionServers for the same scan as they check and instantiate the Filters at different times.
  • Mutable static variables are discouraged in the custom Filter because they will be reinitialized when the class is reloaded dynamically.

Extensions
The example code is just a starting point for more interesting functionality tailored to different use cases. Scans using filters can also be used in MapReduce jobs and coprocessors. A short list of possible ways to extend the code:

  • The most obvious weakness in the example implementation is that the ProxyFilter only supports one jar. Extending it to load all jars in a filter directory would be a good start [4].
  • Different clients may expect certain versions of Filters. Some versioning and bookkeeping logic will be necessary to ensure that the ProxyFilter can serve up the correct filter to each client.
  • Generalize the solution to include MapReduce scenarios that use HBase as the input source. The module can load the custom filters at the start of each map task from the MR job library instead, unloading the filters after the task ends.
  • Support other JVM languages for filtering. We have tried serializing Groovy 1.6 scripts as Filters but performance was several times slower.


Using the ProxyFilter as a generic extension point for custom filters allows us to improve our performance without the hit of restarting our entire HBase cluster.

Footnotes
[1] Class Reloading Basics http://tutorials.jenkov.com/java-reflection/dynamic-class-loading-reloading.html
[2] See our blog post on ClassLoaders for alternate classloader delegation http://tech.flurry.com/using-the-java-classloader-to-write-efficient
[3] The classloader in our example resembles the one described in this ticket https://issues.apache.org/jira/browse/HBASE-1936
[4] A new classloader has just been introduced in hbase-0.92.2 for coprocessors, and it seems to fit perfectly for our dynamic filters https://issues.apache.org/jira/browse/HBASE-6308
[5] Example code https://github.com/vallancelee/hbase-filter/tree/master/customFilters


Squashing bugs in multithreaded Android code with CheckThread

Writing correct multithreaded code is difficult, and Android apps are no exception. Like many mobile platforms, Android’s UI framework is single-threaded and requires the application developer to manage threads without any thread-safety guarantees. If your app is more complicated than “Hello, World!”, you can’t escape writing multithreaded code. For example, to build a smooth and responsive UI, you will have to move long-running operations like network and disk IO to background threads and then return to the UI thread to update the UI.

Thankfully, Android provides some tools to make this easier, such as the AsyncTask utility class and the StrictMode API. You can also lean on good software development practices, such as adhering to a strict code style and requiring careful review of code that involves the UI thread. Unfortunately, these approaches require diligence, are prone to human error, or only catch errors at runtime.

CheckThread for Android

CheckThread is an open source project authored by Joe Conti that provides annotations and a simple static analysis tool for verifying certain contracts in multithreaded programs. It’s not a brand new project and it’s not difficult to use, but it hasn’t seen much adoption for Android apps. It offers an automated alternative to relying exclusively on comments and code review to ensure that no bugs related to the UI thread are introduced in your code. The annotations provided by CheckThread are @ThreadSafe, @NotThreadSafe, and @ThreadConfined.

ThreadSafe and NotThreadSafe are described in Java Concurrency in Practice, and CheckThread enforces the same semantics that book defines. For the purposes of this blog post, the only annotation that we’ll be using is ThreadConfined.

Thread confinement is the general practice of restricting data or code so that it is only accessed from a single thread. A data structure confined to the stack is inherently thread confined. A method that is only ever called by a single thread is also thread confined. In Android, updates to the UI must be confined to the UI thread. In very concrete terms, this means that any method that mutates the state of a View should only be called from the UI thread. If this policy is violated, the Android framework may throw a RuntimeException, but it may also simply produce undefined behavior, depending on the specific nature of the UI update.

CheckThread supports defining thread policies in XML files, so while it would be possible, it’s not necessary to download the source of the Android framework code and manually add annotations to it. Instead, we can simply define a general thread policy to apply to Android framework classes.

Time for an Example

The following example demonstrates how to declare a thread policy in XML, annotate a simple program and run the CheckThread analyzer to catch a couple of bugs.

CheckThread’s XML syntax supports patterns and wildcards, which allows you to concisely define policies for Android framework code. In this example we define two Android-specific policies. In general, this file would contain more definitions for other Android framework classes and could also contain definitions for your own code.

The first policy declares that all methods in Activity and its subclasses that begin with the prefix “on” should be confined to the main thread. Since CheckThread has no built-in concept of the Android framework or of the Activity class we need to inform the static analyzer which thread will call these methods.

The second policy declares that all methods in classes ending with “View” should be confined to the main thread. The intention is to prevent code that modifies the UI from being called from any thread other than the UI thread. This is a little conservative, since some methods in Android View classes have no side effects, but it will do for now.

https://gist.github.com/4113656

Having defined the thread policy, we can turn to our application code. The example app is the rough beginnings of a Hacker News app. It fetches the RSS feed for the front page, parses the titles and displays them in a LinearLayout.

This first version is naive; it does network IO and XML parsing in Activity.onCreate. This error will definitely be caught by StrictMode, and will likely just crash the app on launch, so it would be caught early in development, but it would be even better if it were caught before the app was even run.

https://gist.github.com/4113662
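
A hedged reconstruction of that first version follows (IO.getHttp is the post’s utility method, whose exact signature is assumed here; parseTitles, the layout, and the view ids are made up for illustration): everything, including the network call, runs in onCreate on the UI thread.

import java.util.Collections;
import java.util.List;

import android.app.Activity;
import android.os.Bundle;
import android.widget.LinearLayout;
import android.widget.TextView;

public class FrontPageActivity extends Activity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        LinearLayout list = (LinearLayout) findViewById(R.id.story_list);

        // Network IO and XML parsing directly on the UI thread --
        // the bug that CheckThread (and StrictMode) will flag.
        String rss = IO.getHttp("https://news.ycombinator.com/rss");
        for (String title : parseTitles(rss)) {
            TextView row = new TextView(this);
            row.setText(title);
            list.addView(row);
        }
    }

    private List<String> parseTitles(String rss) {
        return Collections.emptyList();   // XML parsing elided for brevity
    }
}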

In this code, we make a call to the static method getHttp in the IO utility class. The details of this class and method are not important, but since it does network IO, it should be called off the UI thread. We’ve annotated the entire class as follows:

https://gist.github.com/4113669

This annotation tells CheckThread that all the methods in this class should be called from the “other” thread.

Finally, we’re ready to run the static analyzer. CheckThread provides several ways to run the analysis tool, including Eclipse and Intellij plugins, but we’ll just use the Ant tasks on the command line. This is what CheckThread outputs after we run the analyzer:

https://gist.github.com/4113676

As you can see, CheckThread reports an error: we’re calling a method that should be confined to the “other” thread from a method that’s confined to “MAIN”. One solution to this problem is to start a new thread for the network IO. We create an anonymous subclass of java.lang.Thread and override run, inside of which we fetch the RSS feed, parse it, and update the UI.

https://gist.github.com/4113683

We’ve annotated the run method to be confined to the “other” thread. CheckThread will use this to validate the call to IO.getHttp. After running the analyzer again, we discover that there’s a new error reported:

https://gist.github.com/4113686

This time, the error is caused by calling the method setText on a TextView from a different thread than the UI thread. This error is generated by the combination of our thread policy defined in XML and the annotation on the run method.

Instead, we could call Activity.runOnUiThread with a new Runnable. The Runnable’s run method is annotated as confined to the UI thread, since we’re passing it to a framework method that will call it from the UI thread.

https://gist.github.com/4113689
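
Putting the two fixes together, the body of onCreate from the earlier sketch would change along these lines (assuming list is now a final local or a field; the CheckThread annotations confining the outer run to “other” and the inner run to the UI thread are omitted here, since their exact syntax comes from the CheckThread library):

new Thread() {
    @Override
    public void run() {
        // network IO and parsing now happen off the UI thread
        final List<String> titles = parseTitles(IO.getHttp("https://news.ycombinator.com/rss"));
        runOnUiThread(new Runnable() {
            @Override
            public void run() {
                // View mutations happen back on the UI thread
                for (String title : titles) {
                    TextView row = new TextView(FrontPageActivity.this);
                    row.setText(title);
                    list.addView(row);
                }
            }
        });
    }
}.start();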

Finally, CheckThread reports no errors. Of course, that doesn’t mean the code is bug free; static analysis of any kind has limits. We’ve just gained some assurance that the contracts defined in the XML policy and annotations will be upheld. While this example is simple, and the code we’re analyzing would be greatly simplified by using an AsyncTask, it does demonstrate the class of errors that CheckThread is designed to catch. The complete sample project is published on Github.

The Pros and Cons of Annotations

One drawback that is probably immediately obvious is the need to add annotations to a lot of your code. Specifically, CheckThread’s static analysis is relatively simple, and doesn’t construct a complete call graph of your code. This means that the analyzer will not generate a warning for the code below:

https://gist.github.com/4113695
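
The gist above makes the point; as an illustration of the same limitation (the method names here are made up, and the exact annotation syntax should be taken from the CheckThread documentation), an unannotated helper method is enough to hide a violation from the analyzer:

// refresh() is confined to the UI thread, but it reaches the network through
// an unannotated helper; without a full call graph, CheckThread has nothing
// to connect "MAIN" to the IO call, so no warning is reported.
@ThreadConfined("MAIN")                 // exact annotation syntax per CheckThread
private void refresh() {
    loadFeed();                         // looks harmless to the analyzer
}

private void loadFeed() {               // no annotation on this hop
    String rss = IO.getHttp("https://news.ycombinator.com/rss");
}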

While this may appear to be a significant problem, in my opinion it is not actually a deal-breaker in practice. Java already requires the programmer to write most types in code. This is seen by some as a drawback of Java (and is often cited, incorrectly, as a drawback of static typing in general). However, there are real advantages to annotating code with type signatures, and even proponents of languages with powerful type inference will admit this, since it’s usually recommended to write the types of “top-level” or publicly exported functions even when the compiler can infer them without any hint.

The annotations that CheckThread uses are similar; they describe an important part of a method’s contract, namely whether it is thread safe or should be confined to a specific thread. Requiring the programmer to write those annotations keeps the challenge of writing correct multithreaded code at the forefront of the programmer’s mind and forces some thought to be put into each method’s contract. The use of automated static analysis also makes it less likely that a comment will go stale and describe a method as thread safe when it is not.

The Future of Static Analysis

The good news is that the future of static analysis tools designed to catch multithreaded bugs is looking very bright. A recent paper published by Sai Zhang, Hao Lü, and Michael D. Ernst at the University of Washington describes a more powerful approach to analyzing multithreaded GUI programs. Their work targets Android applications as well as Java programs written using other GUI frameworks. The analyzer described in their paper specifically does construct a complete call graph of the program being analyzed. In addition, it doesn’t require any annotations by the programmer and also addresses the use of reflection in building the call graph, which Android specifically uses to inflate layouts from XML. This work was published only this past summer, and the tool itself is underdocumented at the moment, but I recommend that anyone interested in this area read the paper which outlines their work quite clearly.

 


Write Once, Compile Twice, Run Anywhere

Many Java developers use a development environment different from the target deployment environment.  At Flurry, we have developers running OS X, Windows, and Linux, and everyone is able to contribute without thinking much about the differences of their particular operating system, or the fact that the code will be deployed on a different OS.

The secret behind this flexibility is how developed the Java toolchain has become. One tool in particular (Eclipse) has Eclipsed the rest and become the dominant IDE for Java developers. Eclipse is free, with integrations like JUnit support and a host of really great plugins, making it the de facto standard in Java development, displacing IntelliJ and other options. In fact, entry-level developers rarely even think about the compilation step, because Eclipse’s autocompilation keeps your code compiled every time you save a file.

There’s Always a Catch

Unfortunately, no technology is magical, and while this setup rarely causes issues, it can. One interesting case arises when the developer is using Eclipse’s 1.6 compiler compliance and the target environment uses Sun’s 1.6 JDK compiler. For example, at Flurry, during development we rely on Eclipse’s JDT compiler, but the code we ship gets compiled for deployment on a Jenkins server by Ant using Sun’s JDK compiler. Note that both the developer and the continuous integration environment are building for Java 6, but using different compilers.

Until recently this never came up as an issue as the Eclipse and Sun compilers, even when running on different operating systems, behave almost identically.  However, we have been running into some interesting (and frustrating) compiler issues that are essentially changing “Write Once, Run Anywhere” into “Write Once, Compile Using Your Target JDK, Run Anywhere.”  We have valid 1.6 Java code using generics, which compiles fine under Eclipse, but won’t compile using Sun’s javac.

Let’s See an Example

An example of the code in question is below. Note that it meets the Java specification and should be a valid program. In fact, in Eclipse using Java 1.6 compiler compliance the code compiles, but won’t compile using Sun’s 1.6 JDK javac.

https://gist.github.com/4035987
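
The gist holds the actual failing code. Purely as a hedged illustration of the kind of construct involved (nested generic method calls whose type arguments must be inferred), code of roughly this shape has been known to compile under one compiler and trip up inference in another, depending on version; this particular snippet is not claimed to reproduce the bug:

import java.util.Collections;
import java.util.List;

public class InferenceDemo {

    static <T> T coalesce(T first, T second) {
        return first != null ? first : second;
    }

    static <T extends Comparable<? super T>> T max(List<? extends T> values) {
        return Collections.max(values);
    }

    // The result of one inferred generic call feeds straight into another, so
    // each compiler must solve the same inference problem -- and the Eclipse
    // JDT compiler and Sun's javac have not always agreed on the answer.
    static Number pickLimit(List<Integer> configured, Number fallback) {
        return coalesce(max(configured), fallback);
    }
}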

Compiling this code using javac in the Sun 1.6 JDK gives this compiler error:

https://gist.github.com/4036089

“Write Once, Run Anywhere” never made any promises about using different compilers, but the fact that our toolchain was using a different compiler than our build server never bore much thought until now.

Possible Solutions

The obvious solution is to have all developers work in the same environment where the code will be deployed, but this would deter developers from using their preferred environment and impact productivity by constraining our development options. Possible solutions we have kicked around:

  1. Have Ant compile using the Eclipse incremental compiler (using the flags -Dbuild.compiler=org.eclipse.jdt.core.JDTCompilerAdapter and, of course, -Dant.build.javac.target=1.6). This sidesteps the problem by forcing the continuous integration system to use the same compiler as developer laptops, but it is not ideal, as this was never an intended use of the Eclipse compiler.
  2. Move to the 1.7 JDK for compilation, using a target 1.6 bytecode. This solves this particular issue, but what happens in the future?
  3. Change the code to compile under Sun’s JDK. This is not a bad option, but it will cost some of the development speed that comes from the simplicity of Eclipse’s built-in system.

My experience has been that Eclipse is a well-worn path in the Java world, and it’s a little surprising that this hasn’t come up for us before, given our heavy use of generics (although there are lots of generics issues logged over at bugs.sun.com, like http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6302954, which has come up for us as a related issue – the “no unique maximal instance exists” error message).

Switching to use the Eclipse compiler for our deployable artifacts would be an unorthodox move, and I’m curious if anyone out there reading this has done that, and if so, with what results.

We had a discussion internally and the consensus was that moving to 1.7 for compilation while targeting 1.6 bytecode (#2) should work, but it would potentially open us up to bugs in 1.7 (and would require upgrading some internal utility machines to support this). We aren’t quite ready to make the leap to Java 7, although it’s probably time to start considering the move.

For now, we coded around the compiler issue, but it’s coming up for us regularly, and we are kicking around ideas on how to resolve it. In the near term, for the projects that run into this generics compile issue, developers are back to using good ole Ant to check whether their code will “really” compile. It’s easy to forget how convenient autocompilation has become, and the fact that it isn’t really the same as the build server’s compiler.
