Andrew Burke has been a professional independent developer for over 20 years, working in everything from HyperCard and Lotus Notes to Ruby on Rails and iOS. Besides building software for various businesses, he teaches web development, speaks at conferences, and has several SaaS products and iOS apps on the side. In his spare time, he also does fan art mash-ups of iconic science fiction ships and characters with equally iconic Nova Scotian scenery – which are surprisingly popular in Halifax.
Mike Hayes is a certified coach, teacher, and speaker with the John Maxwell Team and the president of Changing Leaf, a leadership development company dedicated to developing better leaders. He’s also the co-author of Dreaming Big Being Bold 2: Inspiring Stories from Trailblazers, Visionaries and Change Makers.
Once upon a time I wrote a computer program that did not require data. It was called helloworld.exe and it was awesome. It was also a wee bit useless. The essence of useful software is taking input, doing something with it, and spitting something out.
There is lots of data out there. My first professional gig after university was consuming text files dumped out by a COBOL app. (They were delicious.) You can parse XML, munge HTML, manipulate images, or decipher networking protocols. But despite the variety of possible data sources, we end up sticking a lot of our data in relational database systems.
And then we write a lot of code for moving data in and out of the database. And then we complain about how much data access code we have to write and maintain. In Survival Skills for Developers, the first item for your basic survival pack is a data access toolkit (homegrown or open source or commercial). The reason is not to be trendy or to sound up-to-date at developer conferences. The point is to make better use of your time by relying on frameworks and libraries to do some of the heavy lifting for you.
Do Not Be Afraid
In the .NET programming world, ADO.NET is the underlying data access technology. Many data access patterns and frameworks have been built on top of ADO.NET and yet scores of developers still write ADO.NET data access code the way they learned nearly a decade ago. What a waste.
There is no reason to be afraid of modern data access techniques. You do not have to rewrite your existing codebase. .NET programmers can still use classic ADO.NET when it’s expedient. You can mix and match tools and frameworks. I have a small project that I’m transitioning from SubSonic (used during prototyping) to NHibernate. Currently the code features a mix of SubSonic and NHibernate as I transition and it works just fine.
Frankly I think developers who refuse to explore alternatives are being irresponsible. I’m not holding up any specific frameworks, toolkits, or approaches as best. I’m simply saying that a failure to be well informed about options is intellectual laziness and we cheat our employers, clients, and stakeholders when we insist on writing everything by hand every time we need a record from the database. We would question the judgment of a house builder relying on all manual tools, yet we mindlessly churn out the same data access code over and over and over again.
Stop Freaking Out About Inconsequential Performance
A mental roadblock for many folks is a worry about performance going down the toilet by relying on something like an ORM (object-relational mapping) tool. If there is one serious mental shortcoming amongst programmers, it is our obsessive need to prematurely optimize everything even when there are ample computing resources to deal with our less-than-completely-efficient code.
Stop freaking out.
The JVM and the .NET runtime have proven that letting the computer handle some tedious work is totally worth it even if it’s less efficient than what you could (theoretically) write by hand.
CPU time is cheap. Programmer time ain’t. Spend time optimizing code only when it becomes necessary.
Do You Really Need a Database?
Simple lists of objects can be similar to tables of records. The relationships between objects can be similar to the relationships between tables. Nevertheless, there is a mismatch between object oriented programming and relational data (perhaps you’ve heard of the object-relational impedance mismatch). Depending on your situation, you might not really need a relational database. Check out some of the technologies in the NoSQL side of the world for alternatives to relational database storage engines.
There are a lot of options out there for working with relational data. Check out Barry Gervin’s article All I Wanted was My Data for some options.
What are your suggestions? Drop them in the comment box.
Let’s suspend reality for a moment and pretend you’re heading out into the woods this weekend. Set aside the fact that you are a software developer and have no business tromping around out in the wilderness. You’re going to want to take along a few basics: maybe some matches, a tent or shelter of some sort, a bit of food, and some dry clothes. To survive you need the right tools / supplies and the skills to use them.
(I bet you see where I’m going with this.)
Back at the office, where your mouse and keyboard fret over your safe return, there awaits an entirely different survival scenario:
Survivor: Cubicle. Outcode. Outbuild. Outlast.
Surviving as a software developer is more than stringing together some lines of code that read and write from a database. Sure, those are basic skills. To survive in the woods you obviously need to walk and breathe, but you also need to start a fire and build a shelter.
The following 8 items form a basic survival pack that can get you through most modern software development forests:
1. Data access toolkit (homegrown or open source or commercial)
If reading and writing from a database is a basic skill like walking, then a good data access framework is your walking stick. I’m a .NET guy so I’ve worked with NHibernate, SubSonic, Entity Framework, and a couple homegrown solutions. An ORM tool is not necessary. Use the core ADO.NET classes directly if that works for you. What you absolutely should avoid is writing all your data access code from scratch every time you need to work with data.
Personally I recommend you learn the basics of several modern data access toolkits and learn about some of the code generation solutions available on the market so you can make informed choices in your projects.
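To make the "don't write it from scratch every time" point concrete, here is a minimal sketch of a tiny homegrown helper, written in Python with the standard sqlite3 module rather than any of the .NET toolkits named above. The fetch_all function, the customers table, and its columns are hypothetical names invented for this illustration.

```python
import sqlite3


def fetch_all(conn, sql, params=()):
    """Run a parameterized query and return each row as a plain dict."""
    cursor = conn.execute(sql, params)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Hypothetical in-memory table, used only to exercise the helper.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customers (name) VALUES (?)", [("Ada",), ("Grace",)])

rows = fetch_all(conn, "SELECT id, name FROM customers WHERE name = ?", ("Grace",))
print(rows)
```

Even a helper this small removes the cursor and column-mapping boilerplate from every call site, which is the kind of heavy lifting a full toolkit does on a larger scale.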
2. Regular expressions
OK, let’s crank the controversy meter up one notch. Regular expressions are like waterproof matches. Sure you don’t need them, but they can make things a heck of a lot easier in some circumstances. When it comes to text processing, there are lots of approaches that can work, but a regular expression can turn an arduous coding exercise into a single line of code.
If you use Visual Studio, I highly recommend you explore the regular expression syntax available for the Find dialog box. It is amazingly useful once you learn the wonky syntax.
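As a small illustration of the "single line of code" claim, here is a sketch in Python using the standard re module. The log line is made up for the example; the point is only that one pattern replaces a hand-written loop that would otherwise walk the string looking for digits and separators.

```python
import re

# Hypothetical log line containing two timestamps.
log_line = "2009-07-15 14:37:14 ERROR upload failed; retry at 2009-07-15 17:37:14"

# One pattern pulls out every "YYYY-MM-DD HH:MM:SS" timestamp in the text.
timestamps = re.findall(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", log_line)
print(timestamps)
```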
3. Unit testing
Let’s try one more notch higher on the controversy meter. You must be able to write and execute unit tests for the language and platform you are developing on. Unit testing is like a compass. You don’t always need it and you won’t need it constantly, but it can be priceless in many situations.
Unit testing has its fair share of zealous proponents and vehement haters. I like to think I have a nice moderate middle-of-the-road perspective. Unit testing is incredibly important in some circumstances. I am not an adherent to the Church of Test Driven Development. I do not usually attempt to achieve complete code coverage in unit tests. But without the ability to write and execute unit tests, you risk wasting precious time walking in circles as you test the same things over and over again manually.
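For anyone who hasn't written one yet, here is roughly what a handful of unit tests looks like, sketched in Python with the standard unittest module. The parse_price function is a hypothetical stand-in for whatever code you would otherwise keep re-testing by hand.

```python
import unittest


def parse_price(text):
    """Hypothetical function under test: convert '$1,234.50' to 1234.5."""
    return float(text.replace("$", "").replace(",", ""))


class ParsePriceTests(unittest.TestCase):
    def test_plain_number(self):
        self.assertEqual(parse_price("42"), 42.0)

    def test_currency_symbol_and_commas(self):
        self.assertEqual(parse_price("$1,234.50"), 1234.5)

    def test_garbage_raises(self):
        with self.assertRaises(ValueError):
            parse_price("not a price")


# Load and run the tests explicitly so the script reports a result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParsePriceTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Three checks that used to be manual now run in well under a second, every time, without anyone having to remember them.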
4. Basic printing, reporting, and charting
Not every system has to print or produce charts and reports, but lots do. Even web site developers have to consider how their pages will look when printed and perhaps generate PDF documents for printing. Regardless of whether you write desktop apps, web apps, or middleware, you should be able to print and generate some basic reports.
If you are a .NET developer like me, check out some of the vendor solutions like Telerik Reporting, ActiveReports, LogiReport, and XtraReports.
Printing, reporting, and charting are about user needs. Users need to print invoices and put charts of projected vs. actuals into the shareholder report. Sometimes you need paper to get your business done.
5. Internet: sending email and downloading HTTP content
Back in the last century, we invented this thing called the Internet. It’s nice. It lets software programs on different systems communicate with each other. And the Internet has this way of creating new opportunities so you never know when your software will have to become Internet-aware.
Sending email and accessing HTTP content are pretty easy tasks to accomplish in most languages given the plethora of libraries available (e.g., classes built into the .NET framework). And even if your current coding project does not require Internet access, you might see new opportunities once you’ve learned how easy it is. For example, you might decide to automatically email crash reports to your help desk.
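To show how little code the crash report idea takes, here is a sketch using Python's standard library in place of the .NET classes mentioned above. The application name, the example.com addresses, and the SMTP host and port are all assumptions; the send function is defined but not called, since it needs a real mail server.

```python
from email.message import EmailMessage
import smtplib


def build_crash_report(app_name, error_text, sender, recipient):
    """Build (but do not send) a plain-text crash report email."""
    msg = EmailMessage()
    msg["Subject"] = f"[{app_name}] crash report"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(error_text)
    return msg


def send(msg, host="localhost", port=25):
    """Hand the message to an SMTP server; host and port are assumptions."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)


report = build_crash_report(
    "InvoiceApp",
    "ZeroDivisionError in totals.py line 12",
    "app@example.com",
    "helpdesk@example.com",
)
print(report["Subject"])
```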
If you pass other hikers in the woods, you might not need to talk to them, but then again, maybe you will. Better to be prepared.
6. HTML and basic CSS
Regardless of the type of software you build, I firmly believe you should know a little bit about HTML, XHTML, and CSS (and some XML would be good, too). If you don’t build web applications, you don’t need to master these technologies, but a grasp on the basics will come in handy. Examples:
- creating HTML formatted emails to send to users
- updating a development team web site
- customizing a product wiki
- writing end user documentation
- updating the company web site (happens in small companies)
- setting up a blog for the marketing department
(X)HTML + CSS is the lingua franca of the web so just learn it already – no fancy survival metaphor required.
7. Scripting or command line development
Many software components need to work with standard input / output streams or simply do not require a graphical user interface. If you only know how to build Windows or web applications, you are severely limiting your ability to efficiently handle the many small tasks that often accompany development work, like parsing a file or deploying software updates.
If you work in Windows, learn to use the command line along with some basic VBScript and batch (.bat) file "programming."
If you work in a Linux / UNIX environment, you probably already know the things you need to learn. In case you don’t, I suggest digging into some shell scripting, sed, awk, and grep after you’ve mastered the command line fundamentals.
If you work in an OS X environment, you just need Photoshop. No, I’m kidding. Learn the UNIX utilities.
Decent command line and scripting skills are your ferrocerium – more hardcore than a match and much more durable.
8. Services, daemons, and cron
Sometimes software just needs to run on its own. If you’re a Windows developer, learn how to build a Windows Service and how to run programs with the Task Scheduler. If you’re in a Linux or UNIX environment, learn how to write a daemon and schedule cron jobs. Even if you don’t specifically need a Windows Service or a daemon process in the foreseeable future, understanding the concepts will make you a better programmer. Unlike desktop apps that can be easily restarted or web apps in which code executes in short bursts, daemons and services are long-running processes that require careful creation.
Thinking about responsible resource management in a long-running process will make you a better programmer, just like building a campfire in the wilderness from collected wood will make you that much better at building your next campground sing-along-and-make-some-s’mores campfire.
Let the Comments Flow
Agree? Disagree? Think I missed something? Leave a comment and let’s discuss.
(photo credits: Alaskan Dude)
While doing some Facebook Connect development, I found that the expected cookies were not being set when developing on localhost. To fix the problem, I added localhost.local to my hosts file (pointing at 127.0.0.1) and changed the settings for my Facebook application to use localhost.local as the base domain.
I’ve been doing some work with ASP.NET MVC but was having periodic issues with Visual Studio 2008 hanging (freezing / becoming unresponsive) when I tried to run my web application with debugging. The problem only occurred with a specific web project.
I tried deleting the Temporary ASP.NET Files (%userprofile%\AppData\Local\Temp\Temporary ASP.NET Files) but that did not solve the problem.
I tried deleting the obj folder. No luck.
I tried waiting it out one evening. I eventually fell asleep and when I awoke sometime in the middle of the night, my web app was happily waiting for input and Visual Studio debugging was completely responsive. That proved to be a temporary salve.
The problem resurfaced again a few days later and I finally figured out the problem with my uncooperative debugger. I have a folder containing approximately 20,000 images that are not included in the Visual Studio web project but are sitting in a directory in the web site. I had turned on "Show All Files" in Solution Explorer to add some script files into the project. When "Show All Files" is off, F5 (Start Debugging) works like a champ. When "Show All Files" is on, Visual Studio becomes unresponsive. In reality it is not "frozen" but simply taking a very long time to process those 20,000 image files.
Watir, pronounced water, is an open-source (BSD) library for automating web browsers. It allows you to write tests that are easy to read and maintain. It is simple and flexible.
Watir drives browsers the same way people do. It clicks links, fills in forms, presses buttons. Watir also checks results, such as whether expected text appears on the page.
A great use for Watir is to automate tedious form filling during development / developer testing. Let’s say you have a simple form like the following to submit a search query to Google:
<form id="search_form" action="http://google.com/search">
<input name="q" value="derekhat" />
<input type="submit" name="submit" />
</form>
The Ruby code to submit a form is pretty simple:
The only problem is that this won’t work in this case. The reason: there’s an input field called submit that hides the form’s submit method. The workaround is easy:
One of my WordPress-based sites (http://crowdspace.net) was not working when I tried to publish posts containing images from Windows Live Writer. I was getting a 500 Internal Server Error.
I was able to fix the problem by making a small change to the database.
The Error Message
I enabled Failed Request Tracing in IIS7 to find out what was happening server-side to cause the HTTP 500 error. I discovered that the PHP script was trying to insert a row into the wp_posts table with a value of -1 in the post_parent column:
WordPress database error Out of range value for column 'post_parent' at row 1 for query INSERT INTO `wp_posts` (`post_author`,`post_date`,`post_date_gmt`, `post_content`, `post_content_filtered`, `post_title`, `post_excerpt`, `post_status`, `post_type`, `comment_status`,`ping_status`, `post_password`, `post_name`, `to_ping`, `pinged`, `post_modified`, `post_modified_gmt`, `post_parent`,`menu_order`,`post_mime_type`,`guid`) VALUES ('1','2009-07-15 14:37:14','2009-07-15 17:37:14','','','2827426439_7b744abd30_m.jpg', '', 'inherit', 'attachment','open','open', '', '2827426439_7b744abd30_m-jpg','','','2009-07-15 14:37:14','2009-07-15 17:37:14','-1', '0', 'image/jpeg', 'http://crowdspace.net/files/2827426439_7b744abd30_m5.jpg') made by wp_xmlrpc_server->wp_xmlrpc_server, IXR_Server->IXR_Server, IXR_Server->serve, IXR_Server->call, wp_xmlrpc_server->mw_newMediaObject, wp_insert_attachment
Apparently the database didn’t like that… I investigated further and discovered that the post_parent column in my database was set to BIGINT UNSIGNED. In other words, that database column could not hold the value -1 (unsigned integers are zero or higher).
Some Sanity Checking
I checked several other WordPress databases that I control and found that the post_parent column is not UNSIGNED in any of them.
So why the difference? All my other databases were created with earlier versions of WordPress and upgraded. The database in question, however, had been created with the latest WordPress release (2.8 at the time). So I popped open the PHP file that defines the database (schema.php) and discovered:
post_parent bigint(20) unsigned NOT NULL default '0',
What’s Going On Here?
When the XMLRPC script uploads an image (called an attachment in WordPress), it uploads it with a post_parent of -1. Then after the post is created, the script updates attachments with post_parent=-1 with the actual ID of the freshly created post.
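The placeholder sequence described above can be sketched in a few lines. This is not WordPress code: it is a Python illustration against an in-memory SQLite database, with a stripped-down, hypothetical posts table standing in for wp_posts. SQLite has no UNSIGNED type, so the unsigned column is emulated with a CHECK constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for wp_posts with a signed post_parent column.
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, post_type TEXT, post_parent INTEGER)"
)

# Step 1: the attachment is uploaded before its post exists, so it gets -1.
conn.execute("INSERT INTO posts (post_type, post_parent) VALUES ('attachment', -1)")

# Step 2: the post itself is created and receives a real ID.
post_id = conn.execute(
    "INSERT INTO posts (post_type, post_parent) VALUES ('post', 0)"
).lastrowid

# Step 3: every placeholder attachment is re-pointed at the new post.
conn.execute("UPDATE posts SET post_parent = ? WHERE post_parent = -1", (post_id,))

# With an "unsigned" column (emulated here), step 1 is rejected outright.
conn.execute(
    "CREATE TABLE posts_unsigned ("
    "id INTEGER PRIMARY KEY, post_parent INTEGER CHECK (post_parent >= 0))"
)
try:
    conn.execute("INSERT INTO posts_unsigned (post_parent) VALUES (-1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

attachment_parent = conn.execute(
    "SELECT post_parent FROM posts WHERE post_type = 'attachment'"
).fetchone()[0]
print(attachment_parent == post_id, rejected)
```

The signed table completes all three steps, while the emulated unsigned table fails at the very first insert, which mirrors the out-of-range error in the trace.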
My guess is that a developer on the project decided to update the schema to UNSIGNED because posts do not have negative values for IDs and the post_parent column references a post ID. Obviously that developer did not realize the special use case in xmlrpc.php.
How Do I Fix This?
Fortunately the workaround is easy. Just run the following database query to change the data type on the post_parent column:
ALTER TABLE wp_posts CHANGE post_parent post_parent BIGINT;
If you don’t know how to run queries against your MySQL database, well, this is a good time to learn. I did it at the command line on my 64-bit Windows server:
cd "C:\Program Files (x86)\MySQL\MySQL Server 5.1\bin\"
mysql.exe -u root -p
Enter password: ***********************
Welcome to the MySQL monitor. Commands end with ; or \g.
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> use crowdspace
mysql> ALTER TABLE wp_posts CHANGE post_parent post_parent BIGINT;
The process is similar on a Linux box. I’m going to go out on a limb here and assume that you can figure out how to run mysql on Linux if you’re running your own Linux server.
If you’re using a shared web hosting service, you should be able to use a web interface such as phpMyAdmin to run this command.
That’s all it took to fix the problem for me. Happy blogging!
In my experience, the majority of estimates and schedules for software development projects are derived from hunches, guesswork, and gut instincts. We say things like “that should take about 4 hours of work” without backing it up with data. And we set timelines based on those estimates without adjusting for past accuracy in estimating. And what about the tendency of the customer / stakeholder to change scope midstream? That should drive how big the “buffer” is in the schedule for changing requirements.
I’m as guilty as the next person in many cases. I fight an internal battle on this issue. Part of me wants to just be a carefree developer with little to no regard for estimates. I dare say that my time as a computer science student was when I got stuck in that trap. Nobody taught me to estimate my work, let alone track the accuracy of my estimates over time. I usually had ample time to do assignments and the code flowed easily for me so I never really worried about how long I spent working on things. [In recent years, I know that people like Rick Wightman at the University of New Brunswick have been working on teaching students how to think more like professional developers.]
The professional developer in me knows that good estimates are essential to everyone in the development chain.
The businessperson in me cherishes accurate estimates.
ACM Queue has an interview with Joel Spolsky in which he talks a bit about evidence-based scheduling (which apparently is somehow supported in FogBugz):
In evidence-based scheduling, you come up with a schedule, and a bunch of people create estimates. And then, instead of adding up their estimates – instead of taking them on faith – you do a little Monte Carlo simulation where you look at what speeds developers had in the past, vis-à-vis their estimates. You use that same distribution of probabilities, as we call them, that you had in the past and run a simulation of all of your futures. What you get, instead of a date, is a probability distribution curve that shows the probability that the product will ship on such-and-such a date.
Nice! I need to try that. Sounds so much better than just using spreadsheets to track history.
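To get a feel for the idea, here is a minimal Monte Carlo sketch in Python. It is my own simplification of what Joel describes, not the FogBugz implementation: the task estimates and the historical actual-to-estimate ratios are hypothetical numbers, and every task is assumed to belong to a single developer.

```python
import random


def simulate_completion(estimates, past_ratios, runs=10_000, seed=42):
    """Return sorted simulated totals (in hours) for a list of task estimates.

    Each run scales every estimate by a ratio (actual / estimated) drawn at
    random from the developer's history, then sums the scaled tasks.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(runs):
        totals.append(sum(est * rng.choice(past_ratios) for est in estimates))
    return sorted(totals)


def percentile(sorted_values, fraction):
    """Pick the value below which roughly `fraction` of the runs fall."""
    index = min(int(fraction * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[index]


# Hypothetical data: hours estimated for upcoming tasks, and one developer's
# historical actual/estimate ratios (1.0 means the estimate was spot on).
estimates = [4, 8, 2, 16]
past_ratios = [0.9, 1.0, 1.2, 1.5, 2.0, 3.0]

totals = simulate_completion(estimates, past_ratios)
print(f"naive sum of estimates: {sum(estimates)} hours")
print(f"50% chance of finishing within {percentile(totals, 0.5):.1f} hours")
print(f"90% chance of finishing within {percentile(totals, 0.9):.1f} hours")
```

Instead of a single number, you get a spread: the gap between the 50% and 90% figures is a rough measure of how much buffer the schedule needs.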
What’s most interesting and useful about tracking estimate-to-actual history on a per-developer basis is that you get data about how accurate the hunch / gut instinct of each person really is. This is so much more powerful than just collecting team or project based accuracy. Of course there will be outliers in the data, like when a developer who does a lot of similar tasks has to estimate something new or unrelated.
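Tracking this per developer doesn't need anything fancier than a log of estimates and actuals. Here is a small Python sketch; the developer names and hours are hypothetical, and using the median is my own choice to blunt the effect of the outliers mentioned above.

```python
from collections import defaultdict
from statistics import median

# Hypothetical history: (developer, estimated hours, actual hours).
history = [
    ("billy", 4, 8),
    ("billy", 10, 20),
    ("billy", 2, 5),
    ("dana", 6, 6),
    ("dana", 8, 9),
    ("dana", 3, 3),
]

# Group each developer's actual/estimate ratios.
ratios = defaultdict(list)
for developer, estimated, actual in history:
    ratios[developer].append(actual / estimated)

# The median is less sensitive than the mean to the occasional outlier task.
multipliers = {dev: median(values) for dev, values in ratios.items()}
for dev, factor in sorted(multipliers.items()):
    print(f"{dev}: multiply estimates by about {factor:.2f}")
```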
I asked a friend who is a project manager that I respect (and trust) to comment on this. (I was wondering if there was a more popular term than “evidence-based scheduling” or other good tools to support it.) At his old job he used to teach an estimating course for developers. Check out the part I highlighted:
Well when you run a project you have a bunch of planned dates (when are you planning to be done), and you are really supposed to keep track of actual dates (when did the work actually finish). If you are really keen (and perhaps have a database to track this by resource -and perhaps by technology and task type)… you should be able to form some projections on developers… I.e. Billy always takes twice as long to do a design but he does a detailed job so the coding goes twice as fast. You can use this information to do a gut check on teams too… but Joel is right … these estimates are as complex as the people who are on the teams and the types of tasks they are working on. There is also the issue of doing something for the first time… very hard to estimate. It is only when you do something multiple times that you get good at estimating… and only if you are looking back on your estimates looking for trends. What Joel is describing is putting some trends to the old estimate / vs. actuals data.
Not sure what to call it. But PMs are always trying to log history of missing or hitting deadlines and you very quickly get a sense of which developers have good estimates, and which developers you need to double, triple, etc… On my last project I had a developer tell me they would be able to get something done by Friday of that week and it actually took about 3 months of 8 people full time… 😉 I’ve heard the statistic of 8:1 (fastest developer to slowest developer).
To my memory, I have never had a developer highlight his/her accuracy in estimating on a resume or during an interview. That would be an interesting thing to do. It would blow me away if someone told me they had a spreadsheet with their personal estimation accuracy showing their estimates-to-actuals history. That would be an impressive artifact!