Ask HN: What topics/subjects are worth learning for a new software engineer?

Not the person you're replying to, but I'll share one of my favorites:

Given an address such as "123 Main St, Boston, MA 00215", break it up into its component parts of: street number, street name, city, state, zip.

(Note: I usually just ask for pseudocode, but if they want to write real code, they can.)

The reason I like The Address Problem (TM) is that it's a real problem I've solved before. I've built address verification systems for some large US companies that you've almost certainly heard of. Most companies in the consumer space deal with some sort of address input, whether it's StitchFix deliveries or Redfin for real estate. Most of those companies outsource it, but still... you store customer addresses in your database, right? :)

I also like it because anyone who has spent any amount of time in the United States (where I reside) already has the domain knowledge. It's intuitively obvious to any American or even recent immigrant which part of the address is the street number, which part is the city, and which part is the zip code. You don't have to know anything special coming into it -- you don't need to know what a binary tree is or what a linked list is or any of the bajillion other CS concepts that I learned in college, see in interview questions, and have yet to use in actual real-world use cases. All you need to know is how to read the front of an envelope or an entry on Yelp.

Another reason I like it is that there are some really obvious ways to go, and there are some really complex ways to go. Arriving at a solution that _works_ isn't really the point. Any programmer out of boot camp can do that for the simple case, and there are a billion complexities to addresses that most people don't think about. What's more interesting to me is how you arrive at your solution, and whether that solution is something you can build upon to handle various edge cases.

For example, a lot of candidates will say "I'd use a regular expression," at which point I'll ask them to write one. They usually struggle a bit with it, but even if they do get it, I throw a curveball: "What if some inputs have no commas?" Then they have to modify their regex to handle both comma-separated and comma-less addresses.
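To make the regex route concrete, here is a minimal sketch of the kind of first attempt a candidate might write. The pattern and the `parse_with_commas` name are my own illustration, not a reference answer, and it assumes exactly the "number street, city, ST zip" shape with both commas present.

```python
import re

# Hypothetical first-pass pattern: relies on the two commas to find the
# boundaries between street, city, and state.
WITH_COMMAS = re.compile(
    r"^(?P<number>\d+)\s+(?P<street>[^,]+),\s*(?P<city>[^,]+),\s*"
    r"(?P<state>[A-Z]{2})\s+(?P<zip>\d{5})$"
)


def parse_with_commas(address):
    """Return the named parts as a dict, or None if the pattern does not match."""
    match = WITH_COMMAS.match(address.strip())
    return match.groupdict() if match else None


print(parse_with_commas("123 Main St, Boston, MA 00215"))
print(parse_with_commas("123 Main St Boston MA 00215"))  # no commas: None
```

The second call is the curveball: once the commas are gone, nothing in the pattern can say where the street ends and the city begins, which is usually where the conversation gets interesting.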

Some candidates will start off looking at the beginning of the string and saying something like "I'll look for a substring of Street/St or Avenue/Ave or similar", to which I can throw the curveball of "123 Main Street E, Boston, MA 00215" or a more fun one, "123 St Francis Ave, Boston, MA 00215".
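A rough sketch of that suffix-hunting idea shows why both curveballs hurt. The `SUFFIXES` set and the `naive_street` function are hypothetical and deliberately tiny; a real suffix list would be much longer.

```python
# Assumption: a short illustrative suffix list, not a complete one.
SUFFIXES = {"street", "st", "avenue", "ave", "road", "rd"}


def naive_street(address):
    """Take everything after the house number up to the FIRST suffix-like token."""
    tokens = address.replace(",", " ").split()
    for i, token in enumerate(tokens):
        if i > 0 and token.lower().rstrip(".") in SUFFIXES:
            return " ".join(tokens[1 : i + 1])
    return None


print(naive_street("123 Main St, Boston, MA 00215"))        # Main St
print(naive_street("123 Main Street E, Boston, MA 00215"))  # Main Street (the "E" is lost)
print(naive_street("123 St Francis Ave, Boston, MA 00215")) # St (stops at "St" in "St Francis")
```

The first address works, the second silently drops the post-direction, and the third stops at the "St" in "St Francis".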

I really like the address question because of the endless curveballs I can throw at it. What happens if the user leaves off the ZIP code? What if they abbreviate the city to "SF" instead of "San Francisco"? What about apartment numbers? How do you handle pre-directions and post-directions? What's the city for "123 Main Street West Palm Beach FL 33409" and how do you know?
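The West Palm Beach example can be sketched as a search over split points. Everything here is an assumption for illustration: the `candidate_splits` name, the caller-supplied `known_cities` set, and the shortcut of treating the last two tokens as state and ZIP.

```python
def candidate_splits(address, known_cities):
    """List every (street, city) split whose city part is in known_cities."""
    tokens = address.replace(",", " ").split()
    # Assumption: first token is the house number, last two are state and ZIP.
    middle = tokens[1:-2]
    candidates = []
    for i in range(1, len(middle)):
        street = " ".join(middle[:i])
        city = " ".join(middle[i:])
        if city.lower() in known_cities:
            candidates.append((street, city))
    return candidates


cities = {"palm beach", "west palm beach"}  # hypothetical lookup table
for street, city in candidate_splits(
    "123 Main Street West Palm Beach FL 33409", cities
):
    print(f"street={street!r} city={city!r}")
```

Both "Main Street" in West Palm Beach and "Main Street West" in Palm Beach come back as plausible, so the string alone can't settle it. Some outside signal is needed, for example which city the ZIP code actually belongs to.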

What reasonable assumptions can you make about an address? For example, there are 50 states + a handful of territories, so states should _theoretically_ be easy to parse out... What other assumptions can we make about addresses? (Gotta be careful with this one -- people often assume that street names end in "Street" or "Avenue" or their abbreviations and forget about post-directions!)
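One way to lean on the "states are a small closed set" assumption is to parse from the right instead of the left. This is only a sketch: `STATE_CODES` is a truncated sample I made up for the example, and `peel_state_and_zip` is a hypothetical helper name.

```python
import re

# Assumption: abbreviated sample; a real table would list every state and territory.
STATE_CODES = {"MA", "FL", "CA", "NY", "PR"}
ZIP_RE = re.compile(r"^\d{5}(-\d{4})?$")  # 5-digit ZIP, optionally ZIP+4


def peel_state_and_zip(address):
    """Strip an optional ZIP and then a state code off the right end.

    Returns (remaining_text, state_or_None, zip_or_None).
    """
    tokens = address.replace(",", " ").split()
    zip_code = None
    if tokens and ZIP_RE.match(tokens[-1]):
        zip_code = tokens.pop()
    state = None
    if tokens and tokens[-1].upper() in STATE_CODES:
        state = tokens.pop().upper()
    return " ".join(tokens), state, zip_code


print(peel_state_and_zip("123 Main St, Boston, MA 00215"))
print(peel_state_and_zip("123 Main St Boston ma"))  # ZIP left off, lowercase state
```

This handles a missing ZIP and doesn't care about commas, but what remains ("123 Main St Boston") still has the street/city boundary problem, so it narrows the problem rather than solving it.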

Since the domain knowledge is simple enough that even a 5th grader understands it, the real challenge comes down to problem solving. There's no "trick" to it. You just work through the problem until you get stuck or we run out of time. Giving hints is fairly straightforward as well.

At the end of this part of the interview, I often allow them to give me some thoughts on how they did. If they could do this whole 30 minutes over again, what would they do differently? Maybe not use regex? :) How do they think the US Postal Service handles addresses? Or Google?

Again, there's no right or wrong answer for these. I'm more interested in whether they exhibit self-awareness, whether they can identify what went wrong earlier, and whether they can learn from their mistakes.

(In case you were wondering, the _real_ answer I'm hoping for is "I'd ask Google Maps." I've had exactly two people give me this response in the 8+ years I've been giving this challenge. I don't know if it's because they think it's a dumb answer, if they know that I expect a code-related answer, or if it never occurs to them... but I like to think someone who wants to see what Google says isn't the type that constantly wants to reinvent the wheel. Note that NOT giving this answer is not a deal-breaker. I generally give the benefit of the doubt to candidates here.)