Archive for the 'java' Category


Google interview, Microsoft survey, enterprise java, and more…

Tuesday, November 6th, 2007

Well today is the big day. At 16:00 EST I’ll have my first of two phone interviews with Google for a software engineer intern position at Google SMo (Santa Monica for those that don’t speak Googlese). A friend of mine that works at Google SMo was the one that helped me get an interview, and it will be his boss that I interview with for my position. Immediately following the first interview, I’ll have a 15 minute break before a second Google employee calls to interview me.

I spent some time reviewing some basic algorithms stuff (e.g., different sorts, dynamic programming, greedy solutions) but I figure at this point that I’ll just play it safe and be myself. If they aren’t able to accept that not everyone has all the answers then it’s probably not a good fit for me. Pretty much everyone I’ve talked to who has either worked for Google or has experience in the field agrees that there’s no use freaking out, just solve the problems :).

On a different note, I received a survey from Microsoft that they send to all potential employees regarding “race and gender equality efforts.” It was basically a 3×5 index card that asked for my gender (or “prefer not to share”) and my race with a “check all that apply” instruction. I filled it out for kicks and we’ll see whether they’re interested enough to fly me to Redmond. I would like to get involved there just because F# is coming out and I specifically asked to work on the F# team. Anyway, here’s to internships *raises beer*.

My new fun adventure is figuring out the ever complex and baffling practice that is Java enterprise development. I have pretty much no knowledge of the process of developing a web app in Java, other than a faint idea that I need something like Tomcat or Jetty or JBoss or one of the other million frameworks. We’re working with a GWT frontend and it’s my job to develop a Java backend of sorts, or some other suitable backend, that will talk dirty (or nicely if you like) to the GWT frontend to provide some data.

Anyone got any hints on how to do all that? Just a basic guide is needed, I just need something to jump start me.

Stay tuned, later this evening or early tomorrow I will post two blog entries (one for each interview) about my experiences and the questions I was asked in my Google interview. That’s all folks :)

PS: Welcome to the wordpress version of my blog. Blogger.com was taking too long to update my blog to my domain, so I decided to just convert to wordpress hosted on my domain. Cheers to how simple it was!

My weird CS hangups…

Sunday, August 26th, 2007

I was browsing reddit today and ran across a blog entry by Shannon Behrens. He was discussing odd hangups that different CS people that he knew had, specifically the bright ones. Now I’m not sure I consider myself any sort of beaming, radiant, CS luminary with profound ideas but I consider myself a pretty good programmer.

Shannon mentioned that his major hangup is that he’s an open source fanatic, that is, he’ll use a lesser product if it means the difference between using proprietary software and open source software. This isn’t a terribly bad hangup in my opinion, though I can see how some might see it as out of hand if you get into the blurry zone of proprietary drivers for video cards and the like. Shannon’s hangup sounds a lot like my friend Andrew A.’s hangup. He too is obsessive about using open source software. He is an administrator for a free shell service for developers, and with this shell service he uses Debian on the server. I offered up my Sun Enterprise 220r as another server for the service, for users who wanted to play on some Sun hardware. I started out running Ubuntu on it, and though Andrew A. never explicitly asked that I use Debian, he hinted at it because Debian has the “free, open source” philosophy. Long story short, I’m now running Debian on that server.

As for a weird CS hangup for me… I’m zealous about writing concurrent code only in functional or concurrency-oriented languages. I refuse to write code that even attempts to be concurrent in anything like C/C++/Java. When I took a course last semester for C & UNIX programming, one of our assignments included fork()/pthreads. It took all my might to make myself write this program in order to fulfill the requirements for the labs. Even at my current job as a research assistant, in order to avoid writing multi-threaded code in Java, I converted an ecology computer model into Erlang so that I could work with Erlang’s concurrency instead of Java’s threads, which turned out to be a fairly large project.

I don’t know what it is, maybe just the pure ugliness of multi-threaded/multi-proc code in imperative or object oriented languages that creates a mental block for me to write that kind of code in languages like Java or C/C++. Note that I’ll write in those languages, just not with concurrency.

Why not Java for scientific computing?

Monday, July 2nd, 2007

As I write this there are hundreds of scientists trudging along using FORTRAN (probably F77, not even F90 or F95) and C/C++ to run their experiments and analyze their data. C/C++ is arguably the standard in the scientific programming world for building experimental models, and Python seems to be gaining its own following as well, but why doesn’t Java get a little more attention? Younger generations of scientists seem willing to embrace Java, but it’s the “old school” scientists that are slowing down its uptake. They require that their research partners work with the antiquated systems written several millennia before even the old researcher himself was born. This of course makes development slower than traffic during rush hour in Miami. Don’t get me wrong, C is a great language and FORTRAN, if written properly, looks very nice, but neither is exactly the quickest language to develop in. In the discourse that follows I will first lay down some premises that I work with to build the rest of the argument.

First off, we will assume that the researcher will have access to the following:

  1. Reasonably powerful computing systems: In a world of research and the abundance of super computers sitting idle, most researchers will have access to some CPU time on a super computer of sorts. If he does not, he will have to rethink his ability to work on larger scales.
  2. Sufficient resources and time to develop: No decent program can be written hurriedly and automagically assumed to work at the fastest possible speed. Optimization takes time in any language (including FORTRAN, C, and Java) so ample time should be set aside for it.

Now for the meat and potatoes. Java is in many respects very similar to C++. Both are object oriented and provide for a concept of classes. In the modeling world, object orientation makes life incredibly easy when compared to structured languages like C since essentially we can create a “human object” or a “mosquito object” and so on. These objects can have their own qualities and classifications to make them more like the actual object that they’re representing. This makes for excellent code reuse since in Java/C++ you can extend objects to create new objects with similar properties but also differing properties.

In my case I’ve been working in Java on a model of the bite patterns of mosquitoes in various room sizes and with various ratios of humans to mosquitoes. I started this project in C, a favorite language of mine for some things, but for others it’s just downright painful. It wasn’t long before I got to the point where I was going to have to define how humans and mosquitoes move, which is obviously going to be different. A mosquito can’t move nearly as far in a 30 second time period as a human can. So of course I have to make a generic function that takes an argument of some sort (probably a double or an int) and decides how to progress based on the value entered. Now some might argue that you can just have it perform an operation on the entered argument and produce movement that way, but that’s not exactly how it works. Some species’ movement appears “more random” than other species’. On a large scale, a human’s movement appears more random than a mosquito’s because the human can cover great distances compared to a mosquito, and even though the human is moving with a purpose, over a long enough time period the movement will still appear random. Mosquito movement on the other hand will appear more uniform. Given a time period of say 1000 seconds, the mosquito will appear to have covered a small area, say a 3×3 space, quite uniformly, while the human may appear to have covered a 10×10 space more randomly.

Consider this though: what if I’m modeling 20,000 fish, where some movement patterns are similar between fish and some aren’t? Imagine the nightmare that generic function would become, trying to hash out whether the movement of a fish is more random or more purposeful and the distance it travels in a step in time. Suffice it to say that it’s not as simple as writing a generic movement function. Yet if you wrote all the different move functions into a C library you’d have an immensely complex piece of code that defined how all these fish moved. With Java you instead define a class for each type of fish, and when you need to change something you just edit the .java file for that class and change it to your liking. This may seem like a lot of files to handle, but I assure you it’s easier to find a file whose name you have an idea of than to find a snippet of code in one big file.
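To make the class-per-species idea concrete, here is a minimal sketch of how it might look. This is not the actual model code: the names `Organism`, `Mosquito`, `Human`, and `MovementDemo`, the step sizes, and the uniform random step rule are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical base class: anything in the room has a position and can move.
abstract class Organism {
    protected double x;
    protected double y;

    Organism(double x, double y) {
        this.x = x;
        this.y = y;
    }

    // Each species defines its own movement rule for one time step.
    abstract void move(Random rng);

    // Shared helper: a uniform random offset in [-max, max].
    protected static double randomOffset(Random rng, double max) {
        return (rng.nextDouble() * 2.0 - 1.0) * max;
    }
}

// Assumed rule: a mosquito only covers a short distance per step.
class Mosquito extends Organism {
    static final double MAX_STEP = 0.1; // made-up value

    Mosquito(double x, double y) {
        super(x, y);
    }

    @Override
    void move(Random rng) {
        x += randomOffset(rng, MAX_STEP);
        y += randomOffset(rng, MAX_STEP);
    }
}

// Assumed rule: a human can cover ten times as much ground per step.
class Human extends Organism {
    static final double MAX_STEP = 1.0; // made-up value

    Human(double x, double y) {
        super(x, y);
    }

    @Override
    void move(Random rng) {
        x += randomOffset(rng, MAX_STEP);
        y += randomOffset(rng, MAX_STEP);
    }
}

class MovementDemo {
    // Moves one mosquito and one human for the given number of time steps.
    static List<Organism> simulate(int steps, long seed) {
        Random rng = new Random(seed);
        List<Organism> organisms = new ArrayList<>();
        organisms.add(new Mosquito(0.0, 0.0));
        organisms.add(new Human(0.0, 0.0));
        for (int t = 0; t < steps; t++) {
            for (Organism o : organisms) {
                o.move(rng); // the species' own move() is chosen at runtime
            }
        }
        return organisms;
    }

    public static void main(String[] args) {
        for (Organism o : simulate(1000, 42L)) {
            System.out.printf("%s ended at (%.2f, %.2f)%n",
                    o.getClass().getSimpleName(), o.x, o.y);
        }
    }
}
```

The simulation loop never asks what kind of organism it is holding. Adding a new species means adding one more subclass with its own `move`, and the loop stays untouched.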

Now this is where the C++ advocates argue that “this is where C++ can do it just like Java.” Well, that may be true, but in C++ you’ve got to handle your own memory management. To a seasoned programmer with experience in C++ this may be no big deal, but more often than not scientists know programming as a by-product of necessity, not desire. They didn’t want to learn programming, they only did because they had to. This can lead to problems when working in C and C++, since their study of memory management is often just enough to get by with the code they start working on, and is only brushed up on a little more when they run into a problem they can’t fix the way they wrote it the first time. In Java, memory management is handled for you. You let Java’s garbage collector deal with your memory and save yourself a lot of time. When I started my model code in C I spent probably 65% of my time hunting down memory errors. Some may argue that it’s a lack of C experience, and it may very well be, but it still makes programming a lot easier when you don’t have to worry about it at all.

Next on our list is the fact that Java will run on any platform that has a JVM available for it (within reason, mixing JVMs probably isn’t a great idea). That means if your cluster is a hodge-podge of *nix, Windows, and Macs, as long as there’s a compatible JVM for all of them your code will run without a hitch, or at least it should. Java has a built-in API for distributing bits of code to client nodes on a cluster (Remote Method Invocation, see the java.rmi packages in the javadocs). Of course you can do this with C or C++ using PVM/UPC/MPI or any number of solutions, but when you add up all of the hoop-jumping it almost isn’t worth it unless the application is just too large to port without a complete rewrite (which sometimes isn’t a bad idea). RMI aside, it’s just the fact that you can send that code to someone else and they don’t have to worry whether the binary will run or whether you’ve assumed the wrong size for a data type on that platform. It will run.
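As a small illustration of the data-size point: the Java language fixes the width of every primitive numeric type, so the values below are the same on every JVM, whereas in C the width of `int` or `long` depends on the platform and compiler. The class name `PrimitiveSizes` is just a made-up name for this sketch.

```java
// Prints the bit width of Java's primitive numeric types.
// These widths are fixed by the language, not by the platform.
class PrimitiveSizes {
    static String describe() {
        return "byte=" + Byte.SIZE
                + " short=" + Short.SIZE
                + " int=" + Integer.SIZE
                + " long=" + Long.SIZE
                + " float=" + Float.SIZE
                + " double=" + Double.SIZE;
    }

    public static void main(String[] args) {
        // prints "byte=8 short=16 int=32 long=64 float=32 double=64"
        System.out.println(describe());
    }
}
```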

As for Java instead of Python, well, that’s more of a design preference argument. I think that in many scientific applications like system modeling it’s good to have static types because they make it clear what each variable will be used for. I know there are many good arguments for why dynamic types are better, but this is merely a design preference. I like to know exactly what my variables are holding. There’s also the matter of Python’s interpreting/pseudo-compiling vs. Java’s bytecode-compiling, which I think is a moot point. It’s really a design preference when it comes down to it.

That pretty much sums up my thoughts on this… and again I may not be 100% right, but I’m merely speaking from experience.