27 Apr 2017 c.e.
Roadblocks on the Way to Refactoring

tl;dr how good is your mental map of the territory?

My three-month 'anniversary' at Pluot Communications is quickly approaching. In the time since I started, I've worked on a dizzying number of projects, most of which have involved getting up to speed on how WebRTC and React/Redux work. Recently, a lot of our software contractors rolled off the project (and hopefully onto something else equally squishy). Anyway, with their rolling, I've since more or less acquired responsibility for a large React / Redux project. Part of the work we've been doing at Pluot lately is a big re-skinning job, getting ready for a soon-to-come feature that I'm not allowed to tell you about yet.

Reskinning is like refactoring the view parts of a website. It seems logical to me then that refactoring the code parts would rightly follow. And it usually does, depending on what, exactly, has changed about the spec.

Refactoring a project that is A) large, B) mostly unfamiliar, C) untested, D) written in a dynamic, non-self documenting language (JavaScript) is not one of those things that anyone would largely recommend that you do. I'm here to tell you that you can and you should do it. Here's why.

What should Refactoring accomplish?

I'm going to be honest, I don't know why other people re-write things. When I've observed teammates embark on refactorings (which honestly never seems to be as often as I feel like I'm hauling off on them, but I digress), sometimes the code comes out cleaner. Most times it comes out far more confusing and complicated and they claim that they've just saved ... something.

I don't buy white knight refactoring. When I refactor shit I'm usually multiplying code. I refactor things so that I can read them, so that the chance of me breaking something in the future is considerably lessened, so that it's easier to make changes to one part of the code without having to necessarily consider other parts. When I 'refactor' I pull out components, and try to keep all of the logic for a thing in a single, canonical, topically or functionally organized place.

I try as hard as possible to decouple view code. View code is notorious for being an arena for subtle bugs. You change a padding in one place and it looks good everywhere except the checkout page where it completely hides the Buy button for certain languages. Of course you checked everywhere, except there, and it took an analyst three days of pondering why the sales numbers in Turkey had taken a plunge in order for someone to figure out what had happened.

As far as I know, I've never done anything like this. But that just shows what I know.

You mentioned earlier that refactoring large, unknown, untested, untyped projects is a good idea. Can you expand on this more?

Sure. When you're faced with changing a codebase that you don't understand, the probability that your changes will introduce bugs is very high. There's a whole taxonomy of where bugs originate from that I won't get into right now, but a large, a very large class of them, fall under the category of problems introduced because you did not understand how the code worked. The root of these bugs then is your incomplete mental model of the codebase.

So how do you pre-emptively fix this class of bugs, bugs caused because there are incorrect assumptions that you have drawn about how the code works? It may seem obvious, but the easiest way to not make bugs based on bad assumptions is to not hold bad assumptions. How do you know what quality of assumptions you hold? Like any good scientist does, by testing them.

How do you test assumptions about code? There are two ways. One is by closely reading the source document. This is a bit of a pro-level move. The other is by changing things and seeing what breaks and how.

Sometimes I rename a variable just so I can see where all it's used. That way I have to touch every single last place that that particular variable appears. It also makes grep a viable tool to help you check when you're finished with a particular task -- when all mentions of the old variable are gone you're done. Grep is a particularly important tool for non-static or non-compiled languages. Back in the day when I did Java codes, IntelliJ or Eclipse would automatically rename a variable with a few strums of the keyboard. Without that compiler guarantee, it's basically up to you to manually check that every instance of a variable has been updated or changed or whatever. Grep's a good tool for this. If you're new to the grep game, check out git grep -- it only searches files that are tracked by git. The only flag I really use for grep is the occasional -n which shows the line number that a particular match was found at. Between grep and vim, I can crank through a number of renamings pretty quickly.

What's the use of knowing every place that a variable is used? It gives you an idea of the scope of that object. If you know every place that a variable is used, your assumptions about that variable are probably more accurate, which reduces the chance of introducing bugs. More than anything, you'll find out really quickly how good your naming scheme is. If you're having trouble figuring out which 'versions' of a variable name are related to each other, maybe you should use more explicit or unique names for things. Asking someone to use a project unique name for every variable is a bit of a tall order, but hey no harm in considering it. After all, you are programming in a language with a shitty toolset -- hacks are expected.

When to Refactor?

You should refactor any time that you encounter code that you do not understand and the person who wrote the code is no longer available to explain it to you. The goal of the refactoring should be simple -- to make it such that you do understand the code.

What if you break the code and can't get it to work the way that it originally did?

Let's go back to assumptions. If you're unable to re-write code such that A) you understand it and B) it functions as it previously did, then clearly you don't understand what that code is doing.

In this situation, my advice is to either A) start over or B) give up. If you give up, you better not change anything else related to how that code works as you've just irrevocably proven that you have no idea how it works. Clearly the person who wrote it is a madman who sucks at programming and you should celebrate your defeat by posting the offending code-bit to the code shaming website of your choice. I recommend StackOverflow.

What if you need to make a change to that code but you don't understand how it works?

I'm sorry to say it, but your odds don't look very good here. Making changes to code you don't understand is not something I'd put money on. You really shouldn't either, but this all depends on what you consider an acceptable level of risk.

Have you tried asking for help? I hear StackOverflow is a good place for that.

What does Refactoring accomplish?

Ok so you've totally refactored large swaths of your project and managed to wrangle the three-way merge hell into a single large commit.

Congratulations! You are now the only person who, supposedly, understands how the code works. As the only person who understands how the new code works, you have implicitly become its maintainer into perpetuity.

Isn't that exciting?

#refactoring #maps #wtf
12 Mar 2017 c.e.
*Redaxted*

tl;dr: Learning a new framework sucks, especially if that framework is written in JavaScript.

Dear god there is something absolutely screwed up about the JavaScript code ecosystem, I'll tell you what. If I had to put a finger on a root cause, naively I'd assume that few of the people writing code for JavaScript packages have written or worked in any large system not in JavaScript. Well, actually, I can imagine that they have worked in Ruby land. I'll amend that lemma to be "not in a compiled language, at any rate".

I've spent the majority of this week learning the React/Redux workflow as it's what we use on our webapp at Pluot. Generally speaking, the idea behind the architecture is pretty solid and can be summed up thusly: keep all your state in one place, you filthy animal, and pub/sub/observe/notify any updates.
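For the skeptical, here is a minimal sketch of that one-sentence summary. It is written in Go rather than JavaScript, and every name in it (State, Action, Store, the "increment" and "decrement" action types) is made up for illustration. It is not the Redux API, just the shape of the idea: one state, one pure reducer, and subscribers that get notified on each dispatch.

```go
package main

import "fmt"

// State is the single, central piece of application state (hypothetical).
type State struct {
	Count int
}

// Action describes something that happened. The type strings are invented here.
type Action struct {
	Type string
}

// reducer is a pure function: old state and an action in, new state out.
// It never mutates its input and never touches the outside world.
func reducer(s State, a Action) State {
	switch a.Type {
	case "increment":
		return State{Count: s.Count + 1}
	case "decrement":
		return State{Count: s.Count - 1}
	default:
		return s // unknown actions leave the state untouched
	}
}

// Store holds the one copy of the state and the list of subscribers.
type Store struct {
	state     State
	listeners []func(State)
}

// Subscribe registers a callback that runs after each dispatched action.
func (st *Store) Subscribe(fn func(State)) {
	st.listeners = append(st.listeners, fn)
}

// Dispatch runs the reducer, stores the result, and notifies every subscriber.
func (st *Store) Dispatch(a Action) {
	st.state = reducer(st.state, a)
	for _, fn := range st.listeners {
		fn(st.state)
	}
}

func main() {
	store := &Store{}
	store.Subscribe(func(s State) { fmt.Println("count is now", s.Count) })
	store.Dispatch(Action{Type: "increment"})
	store.Dispatch(Action{Type: "increment"})
	store.Dispatch(Action{Type: "decrement"})
}
```

The view code only ever sees the state through a subscriber, and the only way to change the state is to dispatch an action through the reducer.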

There. You now understand Redaxt.

Have you ever programmed anything in JavaScript? The language is so flexible as to be practically incoherent. Prototypes mean that any and everything is up for grabs at runtime which leads to some amazing flexibility and beautiful abstractions but also chaos and mayhem. There is good cause to complain about the release of a new 'JavaScript' framework -- the number of guarantees you have about the conceptual idioms of that framework are very limited. One side-effect of this is that JavaScript developers have the most finely tuned vocabulary for model-presenter-view you'll find anywhere, I guarantee it.

I'd like to recount for you the suffering I have been subjected to at the hands of the JavaScript ecosystem, as I've been learning how to Redaxt things this week.

Npm

I really have nothing much to say about npm as if you've ever used it you'll understand intuitively my point, but for those of you who have never used it, I will briefly explain the horror.

npm is a package manager for node, the JavaScript runtime that unleashed the horrors of mutability onto server side programming. It is as flakey as all fuck. The common advice for fixing weirdness with your program dependencies is to forcibly remove the entire cache of libraries and redownload them from the Internet whence they came. npm builds have a zero guarantee of repeatability, as the version of your dependencies is updated as published. Isn't that nice.

Webpack

JavaScript is great because it doesn't have a compiler but, sometimes, it's really nice to have a compiler. Also the JavaScript language cabal recently underwent a coup from the Ruby department, known more widely as ES6, and they needed a tool to patch over the ensuing chaos. Yes, more chaos than the normal amount of chaos that comes from an untyped, uncompiled, prototypical, dynamically functional language.

Webpack is the everyman's tool for transpiling your sugary, Ruby-ified JavaScript into something that average Joe Browser can understand. Yes, that is correct. State of the art JavaScript requires translation in order to be practically understood. Webpack does a lot of other things, like serving your pages locally, from memory while you're developing, re-writing output paths, copying your files into weird hard to find places, obfuscation in the most literal of senses, and more.

Given the correct flag, webpack will re"compile" your files when they're changed but I have yet to figure out what heuristic it's using to trigger a rebuild, given how randomly it actually works as intended. Your mileage may vary, but if you figure this out please someone let me know how it's supposed to work. To add insult to injury, the build output doesn't include anything so useful as a timestamp so that you can easily tell when the last time it deigned to run was.

Typos

You would think my shitty typing skills would be something I can't blame JavaScript for, but you'd be wrong. JavaScript has the incredibly dubious distinction of being one of those programming languages that you can make a typo in and not find out about it until you've "compiled" the project, refreshed the page and then attempted to click a button. You can mangle a variable name and nothing will happen, that nothingness being precisely the problem. Silent failures are the worst kind of failure. JavaScript has a fancy transpiler and really great[1] syntax but it can't tell when I've made a really obvious keymash. JavaScript is clearly the language of the futur.

Functional Programming Cargo Culted Fanfic, aka The Redux Docs

Here are a few curated selections:

  • "It's very important that the reducer stays pure." like a virgin, until you touch it for the very first time. Or actually no, don't ever touch it because then it will not be pure and how could you do that to a function?

  • "No surprises. No side effects. No API calls. No mutations. Just a calculation." asking for a boy to love him.

  • "let's start writing our reducer by gradually teaching it to understand the actions we defined earlier", oh noble Pygmalion.

  • "Redux will call our reducer ... for the first time. This is our chance...", your once in a lifetime opportunity!

  • "Here is our code so far. It is rather verbose:" This statement was followed by the least verbose code I've seen. I do not understand what the word 'verbose' means to anyone in this universe.

  • "we can split the reducers into separate files and keep them completely independent" as separate files are a recognized talismanic barrier for keeping code from becoming logically entangled.

To be fair, once you get past the many rough edges, JavaScript is an entirely delightful language to write your own universe in. Redux/React is a fairly well thought out system for updating views and maintaining a centralized state mechanism, and I'll probably consider using it on my next JavaScript project.

Or maybe I'll just try Elm. I hear its compiler is very nice.

This post is dedicated to my friend Myk who can't remember the last time he did something that wasn't JavaScript.

[1] Great if you like syntactic sugar bullshit that is the opposite of readable.

#javascript #react #redux
20 Jan 2017 c.e.
The Type of Problem that is an Algorithm

Wherein I Rant About Being Interviewed

I had an in-person interview recently where the day was split up by 'type' of skill that I, as a 'generic' software engineer (typically this means web), am expected to have acquired by this point: front-end knowledge (JS & CSS), architecture, databases, algorithms, 'culture', a tour of the code-base, and lunch with a few other engineers. Without question, I bombed the algorithm portion, fumbled quite a bit on the front-end piece (in my defense, it has been 5 years since I last made a web UI), and revealed a large knowledge gap about how database indexing works.

Having the skills required to be a web engineer so concretely delineated and feeling like I wasn't really particularly good at any one of them, made me question what skills I had. It certainly wasn't domain specific knowledge of large databases or JavaScript or how to solve algorithms. Yet I have been writing and shipping software for half a decade. What exactly have I been doing, if it wasn't any of these things?

As a mobile dev of the Android variety, I'd say that the types of problems that I've spent a lot of time thinking about are application state[2], manual testing, code architecture, network availability, and view drawing code. I enjoy thinking about the way code fits together.[3]

I've found interviewing to be a struggle because solving algorithm problems is not something I'm good at. I've taken a single online class on them, solved a number of exercises, watched tens of other engineers solve algorithms problems, and am generally familiar with the classes of problems that algorithms fall into. But I haven't, personally, solved a large number of algorithms problems on my own, an experience that I think greatly improves your ability to do so when timed and under pressure.

In thinking through this, I found it helpful to delineate the difference between an algorithm and an architecture question, as a way to better understand what sort of work involves needing to be good at one or the other of these skills.

The Nature of an Algorithm

An algorithm is the precise set of steps that you must take in order to accomplish a task. A solution to an algorithmic problem can be judged on the number of steps that it takes to complete the task and the amount of computer memory that is occupied during the process. Solving an algorithm, then, is a matter of divining the necessary steps.

An algorithm can fail if the defined steps do not return a correct answer for a valid starting condition. Part of the process of finding a solution is identifying potential places that the steps, blindly followed, will arrive at incorrect results.
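To make "judged on the number of steps" concrete, here is a small sketch in Go that counts the steps two different algorithms take on the same task: finding a number in a sorted list. The function names and the step-counting return value are my own illustration, not anything from the interview. The empty-list case is the kind of valid starting condition where blindly followed steps tend to go wrong.

```go
package main

import "fmt"

// linearSearch scans left to right and reports the index of target (or -1)
// along with how many elements it inspected.
func linearSearch(xs []int, target int) (index, steps int) {
	for i, x := range xs {
		steps++
		if x == target {
			return i, steps
		}
	}
	return -1, steps
}

// binarySearch assumes xs is sorted ascending and halves the remaining
// range on every step. It also reports how many elements it inspected.
func binarySearch(xs []int, target int) (index, steps int) {
	lo, hi := 0, len(xs)-1
	for lo <= hi {
		steps++
		mid := lo + (hi-lo)/2
		switch {
		case xs[mid] == target:
			return mid, steps
		case xs[mid] < target:
			lo = mid + 1
		default:
			hi = mid - 1
		}
	}
	return -1, steps
}

func main() {
	// A sorted list of 0..999; look for the last element, the worst case
	// for a left-to-right scan.
	xs := make([]int, 1000)
	for i := range xs {
		xs[i] = i
	}
	_, linear := linearSearch(xs, 999)
	_, binary := binarySearch(xs, 999)
	fmt.Println("linear steps:", linear, "binary steps:", binary)
}
```

Both functions return the same answer. They differ only in how many steps it takes to get there, which is the yardstick an algorithms interview tends to use.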

The Nature of Program Architecture

Architecture is the theater of abstraction. As a topic, it is a wider, and more nebulous category of problem, but generally speaking it is a class of problems that concerns the definition of spheres of responsibility between the various functions of your application. Usually this involves some amount of data flow. When you 'architect' a system you are naming things, assigning work, and defining relationships.

Wrapping Up

The interviews that appreciate the skills that I currently have are the ones that I've done the best at. I didn't end up getting the job at the company who put me through a litany of web-centric problems.[1] It's frustrating to be judged on what you don't know, rather than what you do, but I've been enjoying the opportunity to better understand what different engineering organizations value in new hires.

And next time, I'll definitely spend time brushing up on algorithms.

[1] I didn't get the job, for reasons cited as "looking for someone with more web experience", at which point I seriously had to question the recruiter's reading comprehension skills, as you could have figured that out from my resume in about five seconds. Based on their interview process, they seem to be seeking candidates with a skillset that generally matches that of their existing team.

[2] Android developers tend to be very well versed in state-management by nature of the job.

[3] As much as I hate them, take-home coding exercises seem to be the best place to showcase this particular talent.

#algorithms #architecture #types-of-developers #skillsets
19 Jan 2017 c.e.
What to Ask When You're Being Interviewed

I've been interviewing for jobs in software development over the past few months. My friends mentioned that they liked my rubric of questions to ask founders, engineering managers, and recruiters about their companies, so I thought I'd share it a bit more widely.

What are you building?

I like knowing what I'd be working on, and in my experience, most engineers relish the opportunity to talk about what they're building. For that reason, I find that these questions make for good ice-breakers. These questions are probably no brainers, but for the sake of completeness...

  • Tell me about what you're working on right now.
  • What's the biggest challenge (technical or otherwise) that your team is looking to tackle in the next 6 - 12 weeks?
  • Who are your users?

How Are You Building It?

This is a good follow up to the previous section.

  • How many engineers are on your team?
  • Tell me about the tools you're using.

Where is the Money Coming From?

A great saying I heard once sums up the crux of what this set of questions drives at: if the service is free, you are the product. More concretely, consider Google Search: your attention and pageviews are the product that Google sells to advertisers for money. Your eyeballs are the monetization strategy.

I'm picky about what I want my time to be put towards. Even if my efforts won't be directly related to a money-making project, I like knowing that my time is spent on something I think is ultimately worthwhile (maybe for you Search is worth charging an eyeball tax!)

Ethics or personal morals around money sources aside, I also like knowing that the company has the potential to grow, and that there's a good chance I'd have a job 6 months from now. In my experience, companies that do the best on these metrics tend to be B2B.

  • How do you make money?
  • How much VC funding have you taken? (Startups with a lot of VC money but no concrete money-making strategy are sometimes trouble. Consider running. Questionable counterpoint: Twitter)
  • Are you currently profitable?

Who Am I Enriching?

For a publicly traded company, this is fairly straightforward -- the answer is (almost) always the shareholders. But for private companies, which most startups are, the only way to find this out is by asking the founders. For me, it's really important that employees are rewarded for their work. I don't have any great questions that I ask about this; if you have any that you like to ask I'd love to hear them.

The Role of Women In the Org

I love asking about this. The answer is important, but not in the way that you might think. Yes, I ask about women's role in an organization because I'm interested in what a company's representatives will say about it, but, more importantly, I'm using this conversation as an opportunity to gauge how comfortable it is. Gender representation is usually something companies are slightly embarrassed about. For that reason, it's a great topic to judge how well someone handles tough conversations, which is really good information to have when trying to decide where to work.

I'd love to hear how much mileage non-women engineers get out of these questions, too.

  • How many women do you have in management roles, or are considered on the management track?
  • ...on your board, or as advisors? (This is a good question for founders or CEOs, but asking a recruiter or engineering manager will probably get you a rough answer)
  • ...working as engineers on the team that I'm interviewing for?

Interviews are Two-Ways

I really enjoy the opportunity to get to know more about how a company functions, where it sees itself growing, and how teams view their role in that growth. Interviews are a great chance to get an inside scoop on what's up with a company that you find interesting, or just better understand how they see themselves growing (or not...).

You may notice that I don't ask very many questions about their build process or particular languages that they use. I don't ask about their build pipeline, or how long it takes to compile their app (though after recent conversations around this, I'm strongly considering adding this to my list). In my experience, people who think deeply about their organization and their culture also tend to think deeply about what technologies they're using. If you really care about what ecosystem you work in, you should ask about it though.

Did I miss something? Do you have questions that you like to ask? I'd love to hear them (you can reach me on Twitter).

#startups #recruiting #what-to-ask
10 Jan 2017 c.e.
Death by Thirst

tl;dr: Drink water before you feel thirsty

A while back, I had the chance to hike the Grand Canyon with some intrepid friends. On our way home, we stopped by the park gift shop where I picked up Over the Edge: Death in the Grand Canyon. Written by Michael Ghiglieri, Over the Edge is a re-telling of all recorded deaths in the canyon. There's falls from the rim vs falls within the canyon, environmental deaths, murder, drowning, plane crashes, animal attacks -- to name a few. It's a pretty gruesome book. However, one detail, taken from the accounts of the deaths from dehydration, really stuck with me.

Death by dehydration or its close cousin, heat exhaustion, is common, especially during the summer months when temperatures within the canyon can surpass 120F. Hikers are advised to carry and consume 2 gallons of water for each day they're on the trail. With few access points to the river, steep walls that take hours to hike out of, and limited resupply points, it's easy to run dry.

However, in some cases, hikers were found dead or near death with water still in their canteens.

You read that right. People died of heat exhaustion with water undrunk.

Why?

The theory is that 1) hikers didn't realize they were dying of heat exhaustion so they were 2) saving their water for when they 'needed' it.

Would drinking the water have saved them? Probably not. But at least they wouldn't have died with water in their canteen. To state the obvious, water in your bottle is useless, as the best time to drink is before you get dehydrated. Plus undrunk water is extra weight to carry.

Thirsty Software

I'll be the first to admit that 'drink the water before you need it' isn't the clearest guideline as far as software maxims go, but I do find it a fun analogy to pull out for making the case for doing preventative work in the codebase, even, if not in spite of, deadlines.

Generally, 'drinking the water' or doing your due diligence and maintenance on your codebase before you need it helps prevent problems later. Due diligence could be updating a library[1], or extracting out an interface, or adding tests to an untested part of the codebase. I've found that by doing this 'preventative work', when a new feature comes down I've got the test coverage or the necessary nuts & bolts available to make the work of adding the new requirement quite trivial.

Unrelatedly, it has been shown that drinking more water while writing software improves the quality of your code. Or at least it's been proven to get you away from your keyboard more often during the day, which one could argue improves the quality of the codebase.

Your mileage may vary.

[1] This isn't an exact analogy. I often wait for library updates to 'mature' a bit before updating. This gives more adventurous members of the community the chance to find bugs before I add it to my production apps.

#foresight #grand-canyon #dehydration
8 Aug 2016 c.e.
Ring Benchmark in Erlang and Go

tl;dr In a head to head test, Go is a better language for writing concurrent code than Erlang

Problem Statement

I'm currently working my way through Joe Armstrong's Programming Erlang.[1] One of the exercises from the chapter on concurrent programming is as follows:

Write a ring benchmark. Create N processes in a ring. Send a message round the ring M times so that a total of N * M messages get sent. Time how long this takes for different values of N and M.

Write a similar program in some other programming language you are familiar with. Compare the results. Write a blog, and publish the results on the Internet

I started off in Erlang. In my tests, I sent one message around a ring of 10,000 processes 10,000 times in 88s.[2] That's a single message that's being passed sequentially, for a total of 100 million message passes.

I chose Google's Go to run a comparison in, partially because Golang and Erlang look alike on paper (jk), and because their concurrency models share common roots in Hoare's 1978 paper, "Communicating Sequential Processes".[3]

There are differences, however. From a pure API perspective, Erlang allows processes to pass messages to each other directly, whereas in Golang channels are used to abstract away which process you're passing to.[4] Practically speaking, this means that in Erlang you're creating a ring where each node knows the Pid (Process IDs) of the next node; in Go each node has a channel that is connected to the next node.

After re-writing the program in Go, I found a similar ring (10k nodes by 10k trips around) to finish in about 76s. That's 12s faster than the Erlang version.

I found Go's concurrency model to be easier to work with. The channel paradigm made it easier to reason about passing pieces one to another. Go is moderately object-oriented and procedural, and I found that this made it easier to organize the implementation details into structs. In Erlang, I wrote two looping functions -- one for 'child' nodes and one for the 'head' node (that had to keep track of when to exit). In Go, I found it possible to store the data about who was the first node in a node struct, ergo I only wrote one loop. (Though it's probably possible to do the same in Erlang, much more elegantly than my attempt.)
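The one-loop, one-struct shape described above looks roughly like the following. This is a minimal sketch written for this post, not the code from my repository: the names Node, IsHead and RunRing are illustrative, the message is just a hop counter, and the channels have a buffer of one so that a ring of a single node doesn't block on itself. It assumes at least one node and at least one lap.

```go
package main

import (
	"fmt"
	"time"
)

// Node is one member of the ring. It reads from In and forwards to Out.
// IsHead marks the node that counts completed laps and decides when to stop.
type Node struct {
	In     chan int
	Out    chan int
	IsHead bool
}

// Loop forwards each message to the next node, adding one to the hop count.
// The head counts a lap each time the message comes back around and reports
// the final hop count on done once `laps` laps are complete.
func (nd *Node) Loop(laps int, done chan<- int) {
	completed := 0
	for hops := range nd.In {
		if nd.IsHead && hops > 0 {
			completed++
			if completed == laps {
				done <- hops
				return
			}
		}
		nd.Out <- hops + 1
	}
}

// RunRing builds a ring of n nodes, sends one message around it `laps`
// times, and returns the total number of message passes (n * laps).
func RunRing(n, laps int) int {
	chans := make([]chan int, n)
	for i := range chans {
		chans[i] = make(chan int, 1)
	}
	done := make(chan int)
	for i := 0; i < n; i++ {
		node := &Node{In: chans[i], Out: chans[(i+1)%n], IsHead: i == 0}
		go node.Loop(laps, done)
	}
	chans[0] <- 0 // inject the message at the head with zero hops so far
	total := <-done
	for _, c := range chans {
		close(c) // let the remaining goroutines fall out of their range loops
	}
	return total
}

func main() {
	start := time.Now()
	total := RunRing(1000, 100)
	fmt.Printf("%d message passes in %v\n", total, time.Since(start))
}
```

Each node only knows its own two channels. Nothing in Loop needs to know which goroutine sits on the other end, which is the abstraction the channel model buys you over passing Pids around.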

One drawback I encountered with Erlang's model of directly passing messages between processes is that I needed to initialize the process before creating the next node. This led to some wonky initialization code -- the loops actually contain two sets of logic: one for initializing the loop and the other for the actual message passing. I found this to be quite messy and harder to reason about than the pattern I wrote up for Go. It's possible that I'd be able to clean up the Erlang code, now that I've got a better handle on concurrency models.

I ran into issues with implementing this in both languages, which I think are worthwhile to point out.

Erlang Problems

In Erlang, I had a lot of trouble putting the 'head' node into a second process. Early versions of my code had the head node loop running in the same process that I was running the program from. I kept getting deadlocks in my main program thread and wasn't sure why. The problem was that I was starting the head_loop in the same process as the terminal -- it was deadlocking waiting to hear back. Spawning a separate process for the head_loop mostly fixed this problem.

This bug led me to discovering this nifty one-liner that checks for messages that the console process has received:

F = fun() -> receive X -> X after 0 -> no_message end end.

Goroutines nicely encapsulate all of the spawn_link and Pid gymnastics that Erlang puts you through.

Go Problems

In Go, I ran into trouble when I tried copying Erlang's switch statement loop syntax. Go gives you the option to construct switch-like select statements for channels, a la:

for {
    select {
      case event := <-ui:
        // process user interface event
      case msg := <-server:
        // process server message
      case t := <-tick:
        // time has elapsed
    }
}

via Go, for Distributed Systems [5]

Here's what my original code for the Loop function looked like:

func (p *Process) Loop() {
    for {
        select {
        case msg := <-p.RcvMsg:
            // ...
            p.SendMsg <- msg
        default:
            // do nothing
        }
    }
}

As written, this code runs unreasonably slowly. Passing the message around a single ring of 10,000 nodes takes longer than a minute. The problem is with the default block. As far as I can tell, this makes the loop run constantly. 10,000 for loops running nonstop creates some contention on the CPU, which leads to slowness.

There's a few ways to fix this. One is to add a sleep to the default block.

default:
    // do nothing for some time
    time.Sleep(5 * time.Millisecond)
}

Even adding as little as 1 ms of sleep fixed the problem considerably. To really fix the problem, you can remove the select statement (and its default) entirely. The goroutine will then block until it receives a message on its receive channel.

This bug highlighted a cool feature of Go: a receive on a channel blocks until a message arrives (unless it's used in a select statement with a default). In the right conditions, enough goroutines waiting on channels in the right configuration will create a deadlock.
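A minimal sketch of the blocking version follows. The Process struct and its RcvMsg and SendMsg fields come from the Loop function above. The string message type and the little main function are assumptions added so the example runs on its own.

```go
package main

import "fmt"

// Process has one inbound and one outbound channel. The string element
// type is an assumption made for this example.
type Process struct {
	RcvMsg  chan string
	SendMsg chan string
}

// Loop has no select and no default. The receive blocks, so the goroutine
// is parked by the runtime until a message arrives instead of spinning.
// The loop ends when RcvMsg is closed.
func (p *Process) Loop() {
	for msg := range p.RcvMsg {
		p.SendMsg <- msg
	}
}

func main() {
	p := &Process{RcvMsg: make(chan string), SendMsg: make(chan string)}
	go p.Loop()
	p.RcvMsg <- "hello"
	fmt.Println(<-p.SendMsg)
	close(p.RcvMsg)
}
```

With 10,000 of these running, only the one goroutine holding the message is doing any work at a given moment. The rest are asleep and cost the CPU nothing.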

Wrapping Up

Given a choice between Erlang or Go for writing a concurrent project, Go would be my go-to. It's measurably faster than Erlang, and it has slightly nicer abstractions for spawning new 'processes'.

The complete solution for these problems can be found on Github: Go & Erlang.

[1] https://pragprog.com/book/jaerlang2/programming-erlang

[2] My laptop is a mid-2011 Macbook Air, with a 1.7 GHz i5 processor and 4GB of RAM.

[3] T. Hoare's CSP paper: http://spinroot.com/courses/summer/Papers/hoare_1978.pdf

[4] This re-implementation of Hoare's paper in Golang has a decent discussion at the beginning https://godoc.org/github.com/thomas11/csp

[5] https://talks.golang.org/2013/distsys.slide#43

#erlang #go #benchmark #concurrency
10 Jul 2016 c.e.
How Long Until A Neural Net Reimplements HTTP

tl;dr The future is already here, it's just not evenly distributed -- William Gibson

This past week at the Recurse Center[1], one of our alums, let's call her Jane, gave a brief, 5 minute presentation on a neural net she had built. Jane was working towards building a neural net that could play pong, and had started by implementing a net that could be trained to replace the Math.round function. Given a decimal number between 0 and 1, the net would round up or down, returning either 1 or 0. Her demonstration was of the training program, a command line program where a trainer inputs a decimal number and the net would tell you what it would round that number to. The trainer then provided feedback to the net, yes you got it right! Or no, that's not correct. Over time, with feedback, the neural net 'learns' how to appropriately round numbers up and down.
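To give a feel for what that feedback loop looks like in code, here is a minimal sketch in Go. It is not Jane's net. It is a single artificial neuron trained with the classic perceptron update rule, the names (Neuron, Feedback, Train) are invented for this example, and the "trainer" is a hard-coded rule standing in for a human typing yes or no.

```go
package main

import "fmt"

// Neuron is a single artificial neuron with one input weight and a bias.
type Neuron struct {
	Weight, Bias float64
}

// Round returns 1 if the neuron fires for x and 0 otherwise.
func (n *Neuron) Round(x float64) int {
	if n.Weight*x+n.Bias > 0 {
		return 1
	}
	return 0
}

// Feedback plays the role of the trainer. Given the right answer, the
// neuron nudges its weight and bias toward it (the perceptron rule).
// It reports whether the neuron's guess was wrong.
func (n *Neuron) Feedback(x float64, want int) bool {
	err := want - n.Round(x)
	if err == 0 {
		return false
	}
	const rate = 0.1
	n.Weight += rate * float64(err) * x
	n.Bias += rate * float64(err)
	return true
}

// Train repeats the examples until a full pass produces no wrong guesses,
// or gives up after maxPasses. It reports whether training converged.
func Train(n *Neuron, xs []float64, maxPasses int) bool {
	for pass := 0; pass < maxPasses; pass++ {
		mistakes := 0
		for _, x := range xs {
			want := 0 // the stand-in trainer's answer: round half up
			if x >= 0.5 {
				want = 1
			}
			if n.Feedback(x, want) {
				mistakes++
			}
		}
		if mistakes == 0 {
			return true
		}
	}
	return false
}

func main() {
	n := &Neuron{}
	xs := []float64{0.0, 0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9, 1.0}
	fmt.Println("converged:", Train(n, xs, 1000))
	fmt.Println("0.2 ->", n.Round(0.2), " 0.8 ->", n.Round(0.8))
}
```

Nowhere in that code is there a line that says "round up at 0.5". The neuron ends up with a weight and a bias that happen to draw the line somewhere between the examples it was shown, which is also why numbers close to the boundary are the ones a half-trained net gets wrong.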

At first blush, this looks like a lot of work for something that you could have coded in a few moments. Jane's neural net code was quite large, well over 100 lines. And her net hadn't been fully trained yet -- given 0.43, it incorrectly guessed 1 (most humans could easily tell you the right answer: 0).
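To make the comparison concrete, here is a rough sketch of both approaches. This is not Jane's code: it's a single artificial neuron (a perceptron), far smaller than her net, and every name in it (handRound, neuron, feedback, train) is made up for the illustration. The trainer is automated too: it walks through the numbers 0, 0.05, ... 1 and tells the neuron whether each guess was right, nudging the weights after every wrong one.

```go
package main

import "fmt"

// handRound is the version you could have coded in a few moments.
func handRound(x float64) int {
	if x >= 0.5 {
		return 1
	}
	return 0
}

// neuron is a single artificial neuron: it answers 1 when w*x+b is positive.
type neuron struct {
	w, b float64
}

func (n *neuron) round(x float64) int {
	if n.w*x+n.b > 0 {
		return 1
	}
	return 0
}

// feedback plays the trainer. It compares the guess with the right answer
// and, on a wrong guess, nudges the weights (the perceptron learning rule).
// It reports whether the guess was correct.
func (n *neuron) feedback(x float64, want int) bool {
	got := n.round(x)
	if got == want {
		return true
	}
	delta := float64(want - got) // +1: guessed too low, -1: guessed too high
	n.w += delta * x
	n.b += delta
	return false
}

// train shows the neuron 0, 0.05, ..., 1 over and over until it gets a
// whole pass right, or gives up after maxPasses. It returns the passes used.
func train(n *neuron, maxPasses int) int {
	for pass := 1; pass <= maxPasses; pass++ {
		allRight := true
		for i := 0; i <= 20; i++ {
			x := float64(i) / 20
			want := 0
			if i >= 10 { // 0.5 and above round up
				want = 1
			}
			if !n.feedback(x, want) {
				allRight = false
			}
		}
		if allRight {
			return pass
		}
	}
	return maxPasses
}

func main() {
	n := &neuron{}
	passes := train(n, 100000)
	fmt.Printf("trained in %d passes: w=%.2f b=%.2f\n", passes, n.w, n.b)
	for _, x := range []float64{0.1, 0.43, 0.5, 0.9} {
		fmt.Printf("x=%.2f  hand-coded: %d  neuron: %d\n", x, handRound(x), n.round(x))
	}
}
```

Even in this toy, the one-line rule is buried under a training loop, and the neuron only agrees with handRound once it has seen enough feedback. Its cutoff also lands somewhere between 0.45 and 0.5 rather than exactly at 0.5, because those are the closest examples it was shown.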

Jane has been working on neural nets, nonstop, for weeks. She spent a total of 18 weeks at RC, all of them concentrated on learning more about how neural nets work. At 19 weeks now, she has implemented a net that can barely determine whether a number is closer to 0 or closer to 1. (OK, to be fair, it just needs more training.)

But. Imagine with me for a moment.

What if programming were the task of training a neural net? What if all of our programs looked like Jane's, where we are no longer telling a computer how to round up or down, but rather building a tiny neural net that we train to answer 0 or 1?

You need a function that can round a number? Train a neural net. Need a function that can verify if a credit card number is valid? Train a neural net[2]. Need a function to validate that a user input a valid email address? Train you a neural net, my son, train you a neural net.

What would a browser look like if it was built by a huge, complicated network of neural nets? Would it understand interface design, and GPU optimization? Would it figure out how to resolve 'http://neuralnetsforever.com' into an IP address? Would it figure out how mouse input works? Or perhaps more importantly, how to render GIFs of cats?

How long do you think it would take to train a neural net to communicate with other computers? What would the neural net version of HTTP look like? Would it also have headers and a body? How would the standards be developed so that all nets know how to talk to each other? Would neural nets form their own Internet Engineering Task Force and give it a catchy name?

The Neural Nets Conglomerate For A Common Tongue, perhaps?

Now imagine this technology, neural nets, is available for everyone. Imagine that 95% of all Americans of age 22 know how to train and work with neural nets, as it's something they've been doing since they were 10. In this future, who writes Math.round functions anymore?

I'd wager the answer is no one.

Programmers joke a lot about coding ourselves out of a job, but we haven't succeeded at doing it yet. It's a common speculation, but in the past 50 years or so, we haven't made much progress. Either we're not very good, or we're doing it wrong. Which may explain why not many people seem to know what it would look like, exactly, to lose your job to your code.

I'm here to tell you that it looks like a thousand tiny training programs. I am here to tell you that your replacement, programmer, is neural nets. When neural nets can communicate with each other, when they can reverse engineer HTTP, we will have successfully obsoleted software developers, at least of the 'classical' kind.

We will have trained our way out of a job.[3]

[1] http://www.recurse.com

[2] Luhn algorithm Wikipedia page

[3] Laugh all you want at those Pokemon trainers, but they've got the right idea. Maybe it's time to get good at training.

#http #neural-nets #future-of-software
10 Jul 2016 c.e.
Your Bugs are Making You Better

tl;dr Bugs are powerful indicators of where your knowledge gaps are.

I work at the Recurse Center (henceforth RC) as a Facilitator. As a facilitator I help people pick projects, organize and run interview prep, keep things in the space moving smoothly, and serve as a semi-permanent, always available pairing or debugging partner.

In this capacity I've had the opportunity to help more than a few people with their bugs, and am pretty familiar with the routine of winding your way down through both the problem space and the source code, in an attempt to figure out what's going on.

As an active software developer, I also encounter my own bugs or problems from time to time. In solving these problems, I learn things, most of which could be succinctly summed up as "How my idea of the world is not what actually exists".

This past weekend, I had the pleasure of riding along on a meet & greet with a prospective resident for RC (let's call her Jane). Jane had been the CTO at a financial trading firm in NYC and sometimes taught classes at NYU. Her advice for RC was that we should build out a common environment, as her financial firm had done, built on top of a Linux distribution, that anyone at RC could download and easily set up. They'd be able to develop their software in it and then easily push it out to a variety of places. Her rationale was that one skill her firm looked for in developers was the ability to deploy. By providing pre-made packages for Recursers, we would help them learn the entire deployment stack, as opposed to how we currently help people deploy things, which is pushing them out to Heroku.

First, I have to say that I was opposed to this idea mostly because of the amount of work it would add to my list of responsibilities. Keeping a Linux package up to date for anyone to download is work, and I'd rather not do any more work than I have to.

Secondly, and more ideologically, providing students with a 'clean-room' version of Python (for example) that just works sounds nice in theory, but in practice runs counter to the very idea that Jane had been proposing in the first place -- that providing a box that goes "all the way down" would be more 'learning-rich' than just 'pushing to Heroku'.

In my experience, the biggest learning comes in two ways: 1) doing a thing for yourself or 2) encountering bugs. Providing a working dev environment to all participants at RC effectively removes both of these opportunities for learning. Do Recursers encounter problems installing Python? Yes, sometimes. Do Recursers sometimes waste lots of time getting the right dependencies installed for OpenCV or Pandas, or some other technology? Yes, definitely, all the time.[1] Do Recursers get frustrated by the inability to make actual 'progress' on a project because they're mired in dependency or install hell? Without question.

However. Inability to resolve problems, be it in code or in package downloading, is indicative of not fully understanding what something is and how it works. I agree with Jane: programmers should know how their package managers work, as far down as they need to in order to successfully troubleshoot problems when they encounter them. If you don't understand what a PATH[2] variable is, then having an environment where that is already set up for you perfectly won't help you to figure out how to add new directories to it. Further, it keeps you from understanding how to modify and change your computer such that you can add new programs (maybe ones that you wrote) to the PATH, to be used anywhere.
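For the curious, here is a rough sketch of what a shell does with PATH when you type a command name. It assumes a Unix-like system, and findOnPath is a name invented for this example, not something your shell exposes. The idea is simply: split PATH on colons, then look in each directory, in order, for an executable file with that name.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findOnPath mimics a shell's command lookup (hypothetical helper).
// It walks the directories listed in pathVar in order and returns the
// first regular file called name that has an execute bit set.
func findOnPath(name, pathVar string) (string, bool) {
	for _, dir := range filepath.SplitList(pathVar) {
		if dir == "" {
			dir = "." // an empty PATH entry means the current directory
		}
		candidate := filepath.Join(dir, name)
		info, err := os.Stat(candidate)
		if err != nil || info.IsDir() {
			continue // not here, keep looking
		}
		if info.Mode().Perm()&0111 != 0 {
			return candidate, true
		}
	}
	return "", false
}

func main() {
	path := os.Getenv("PATH")
	fmt.Println("directories searched, in order:")
	for _, dir := range filepath.SplitList(path) {
		fmt.Println("  ", dir)
	}

	if found, ok := findOnPath("ls", path); ok {
		fmt.Println("ls lives at", found)
	} else {
		fmt.Println("ls is not on the PATH")
	}
}
```

Adding a program of your own to the PATH then just means putting its directory into that list. Go's standard library has a real version of this lookup, exec.LookPath in os/exec; the hand-rolled one is only here to show the mechanism.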

This is a key point of the power of knowledge -- knowing how things work enables you to change and modify them as you see fit. Bugs are a gateway to that knowledge, and removing bugs and problems from people's path is really taking away a great learning opportunity. Which, as a facilitator, runs counter to my job.

So the next time you can't get a package to install, instead of getting frustrated, think of all the marvelous things you're learning! Either about how computers work, or the architecture decisions of the package maintainer, or maybe even the politics that went into whatever it is that is causing you grief at the moment.

Maybe it's not the most marvelous thing to waste your time with, but the silver lining's not so bad.

[1] To a large extent, I am ignoring the other, larger problem that the nature of the Recurse Center presents: Recursers work on a wide variety of projects, and the requirements for what a 'common' machine would need in order to be useful to even half of the participants at any given time are daunting, both from a maintenance perspective and from a requirements gathering one.

[2] A PATH variable is a Unix shell[2.a] construct that allows you to run programs from any directory, regardless of where that actual program lives on your computer. This is like being able to start the Chrome app while you're digging thru the Trash Bin.

[2.a] If you are a Mac user, you get to your Unix shell through the Terminal app.
