28 Dec 2018 c.e.
The Demo at 50: Looking Forward

December 9th, 2018 marked the 50th anniversary of Doug Engelbart's Mother of All Demos. (You can watch the actual demo on YouTube or read about it on Wikipedia.) To commemorate the occasion, Doug Engelbart's daughter and some of his longtime collaborators pulled together an all-day symposium for the still-surviving demo crew members and other early Internet luminaries. I, like all the other lumpenproletariat of the modern Silicon Valley, bought a ticket to attend.

The day's festivities were held at the Computer History Museum down in Mountain View, about a forty minute drive from San Francisco early on a Sunday morning. My friend and I arrived early, which gave us time to grab coffee, claim almost-front-row seats at one of the twenty or so ten-person tables that filled the hall where the day's lectures would be held, ogle the paper signs on tall cocktail tables that marked where the in-person demos of similar tech projects would be held, and traipse down to the first-floor museum exhibit, home to one of Google's prototypes for a self-driving car.

It was mostly a day of reminiscing, with a few more modern speakers talking about projects they're currently working on to make the Web a more annotated and sourceable place. The main thrust of most of the projects seemed to be HyperLinking. Ted Nelson, the closing speaker and an early hypertext researcher, is still going on about how HyperLinks should have been bi-directional.

On the System Itself

There was a panel discussion from a few original ARC researchers. We had a hardware guy, a couple of software guys, and Doug Engelbart's daughter, Christina Engelbart. The hardware guy, Martin Hardy, had created a hyperlinked diagram to show us all how the original demo computer system had been constructed. The demo itself was held at a hall in San Francisco -- the actual computer mainframe lived in a research center in Menlo Park, south of SF by a few tens of miles. In the demo, the computer screen printout and video feeds from several different cameras are broadcast onto the screen, so that we can see researchers in Menlo Park, as well as a camera feed pointed at Doug's face, on stage. In order to get these video streams to show, they had to pipe all the data back to the mainframe in Menlo Park, where the computer composed the stream to feed to the projector. They used a microwave tower to beam the feeds, as the Internet hadn't been invented yet. It'd be a few decades until high-speed Internet was installed between here, there, and everywhere.

Once the reminiscing and story recounting was done, they had a little bit of time to ask the audience for questions. There may have been a few, but the only one I remember was from a man who wanted to know, definitively, what room the Demo had occurred in. Given the spirited debate that followed, it seems that the biggest controversy surrounding the event was the actual location where it happened. Good thing we have a video recording of it, otherwise we may not be sure that it happened at all.

Another gizmo that came up during the day was the projector machine that the group developed that could stop a film strip on a single frame. You used to not be able to pause film projectors because the heat from the bulb would burn the frame that you stopped on. Anyway, somehow the ARC research group was able to build a projector that would let you stop the film at any arbitrary point. One day, someone was showing the presentation to a group that wanted to know more about the project and happened to stop the film exactly on a frame that showed the computer had crashed. In the middle of the Demo. If you watch the film, you may be surprised to hear this, as you'd know that during the Demo, the whole system works pretty flawlessly. Well, it turns out that it did, in fact, crash. The reason you can't see it when watching the film is, one, that the digitization process probably lost that exact frame, and two, that the computer system they built was so incredibly quick to come back online that it restarted without anyone noticing. Turns out that the computer system crashed so frequently that they tuned it to come back so quickly that no one would notice it had even failed. It's hard to square that with how long my laptop takes to start some days.

Web Researchers, Then and Now

There were a number of great panel discussions about web technologies from a host of different web pioneers. Even Alan Kay made an appearance -- they put him on one of those teleconferencing robots and he beamed in from his home. He got up a few times to get a thing; I wasn't sitting quite close enough to get a good look at the books on the bookshelf behind him.

I think the rowdiest panel was probably the one with Wendy Hall, a UK researcher who's been working on web hyperlinking technology projects since the Demo, and Peter Norvig, the chief researcher for search at Google. There was a strange amount of hostility in the room towards Silicon Valley Money, chiefly coming from the people, a majority in the room to be clear, who had spent their lives in academia and decidedly not made it rich on the Internet and Software boom that came to be after their demos. Unfortunately, I don't remember the exact issues that showcased Hall and Norvig's ideological differences, but I believe it turned on the question of responsibility for filtering out fake news and propaganda. Wendy had done a lot of work on being able to easily show provenance for information, so it was interesting to see her in conversation with Norvig, big wig of Google Search. As an aside, I'm not sure where the line on authoritarianism comes down between censorship and the promotion of truth, but we definitely seemed to be flirting with it. Even Vint Cerf had some strong things to say about the quality of information on the Internet.

Yet another presenter put up on the screen a Mosaic listserv email from Marc Andreessen, one that talked about how he had hacked into the browser the ability to add annotations to any webpage and asked for beta testers.[1] On-page annotations seemed to be one of the biggest wishes from the bevy of Internet luminaries we heard from. Well, that and a way to get rid of fake news. Dan Whaley from Hypothes.is was on a panel as well. It was interesting, to me, to see modern efforts to bring annotation to the web. I'm not sure what every website would be like with a comments section, but it seems that the effort to find out hasn't died out yet.

Legacy

One thing that Doug's daughter really brought home for me was the question of what the impact and legacy of the Demo was. The company that bought the technology wasn't able to turn it into a successful product. That wouldn't happen until later, much later, after Microsoft and Apple got their introduction to the mouse and such at Xerox's Palo Alto Research Center. In fact, Doug's ARC project was largely dismantled after the team was bought by Tymshare. It seems that he had worked hard to open up the lab to researchers from other projects and universities -- almost everyone who was alive and working in the field at the time had, at one point or another, been to the ARC lab to see the software system at work in person. I can't help but wonder if it was the collaboration and openness of the lab that led to some of the technological marvels the group demoed that day in '68 actually getting out into the world, in some form or another. Sure, there were plenty of other insights and research that the team had done, but the reality is that annotations and bi-directional hyperlinks don't have mass adoption in the same way that the mouse and graphical user interfaces achieved.

How much of this idea leakage was due to the work that Doug did to make their projects available to others outside of their group? How much of it was a result of the same researchers ending up at Xerox's PARC which then let Steve Jobs and Bill Gates inside to see what they had built? It's hard to say, exactly.

[1] I wasn't able to find the original email, but Marc himself uses the feature to explain his investment in Rap Genius

#mother-of-all-demos #impressionism #conference-swag
27 Dec 2018 c.e.
Explaining Replace By Fee

I apologize in advance to those readers of mine that have zero interest in Bitcoin. I'm personally quite absorbed with the project, and am hoping that by writing about it incessantly, I might be able to convince you to at least appreciate the project for its vast complexity, if not for the riches it might make you, if only you invest at the right time.

I'd like to spend some time today writing out everything I know about a small corner piece of the Bitcoin puzzle, a transaction replacement protocol colloquially termed "Replace By Fee", or RBF for short.

A short description of the problem space

In order for a Bitcoin transaction to be considered valid, you must first have it included in a block by a miner. Normally, the way that would happen is as follows:

  1. You compose and sign a valid Bitcoin transaction. I'm leaving the details out here, but think of it like an HTTP packet that is ready to be sent out across the network, if that's helpful.
  2. You broadcast your transaction out from your wallet, onto the Bitcoin network.
  3. Other Bitcoin nodes on the network see your transaction and add it to their 'mempool'. This is the set of all Bitcoin transactions that have not yet been included in a block. They are candidates for inclusion.
  4. A miner receives your transaction and includes it in the block it's trying to mine. The miner finds a winning hash that makes its candidate block a valid block. Your transaction is now mined.
  5. The newly mined block is transmitted from the miner's computer to all the other computers on the Bitcoin network.
  6. Upon receiving this block, each Bitcoin node evicts all of the now-mined transactions from its mempool.
  7. Rejoice. Your Bitcoin is Spent!

You may remember that the topic we're discussing today is known as 'Replace By Fee'. When, you might ask, in this sequence of events might you want to replace your Bitcoin transaction?

The answer is sometime between steps 3 and 4 above. After you've broadcast your transaction, there is a chance that it will be seen and mined by a miner. Once your transaction has been mined, you can no longer broadcast a new version of that transaction, as the inputs to it have now been marked as spent.

There are a few cases, however, where your transaction might get trapped in or evicted from the mempool without being included in a block. One common case is when the number of transactions looking to be included in a block (ie the mempool size) is larger than the available blocksize. In this case, transactions tend to be processed or mined based on the feerate per kilobyte that they offer to pay the miner for their inclusion.

If you've broadcast a transaction with a low feerate, and suddenly the mempool fills up with a lot of transactions that are looking to be included in a block, you may want to update your transaction to provide a higher feerate, so that your transaction will be confirmed in the next available block.

There are currently two mechanisms that people use to try to get their transaction included. The first is what we'll be talking about more in depth here, Replace By Fee. The basic gist of Replace By Fee is that you're rebroadcasting a previously broadcast transaction, but with a greater fee paid than the prior transaction.

The other strategy that wallets use to get transactions included in full blocks is called Child Pays For Parent, or CPFP for short. It involves issuing a new transaction, one that spends the earlier, still unconfirmed transaction. This second, child transaction will pay a larger feerate than it might on its own, with the hope that the now pair of transactions' total feerate will be high enough to merit inclusion in the next block. CPFP only works if the transaction you broadcast has an output that you can spend.
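
To make the package math concrete, here's a tiny sketch of how a wallet might size the child's fee so that the parent-plus-child package clears a target feerate. The function, numbers, and units (satoshis and vbytes) are mine, for illustration only:

def cpfp_child_fee(parent_size, parent_fee, child_size, target_feerate):
    # Fee the child must pay so the whole package hits target_feerate.
    # Sizes in vbytes, fees in satoshis, feerate in sat/vbyte.
    package_size = parent_size + child_size
    total_needed = package_size * target_feerate
    # The child covers whatever the parent's fee doesn't.
    return max(0, total_needed - parent_fee)

# A stuck 200-vbyte parent paying only 200 sats (1 sat/vbyte), and we want
# the package mined at 10 sat/vbyte: the 150-vbyte child must pay 3300 sats.
print(cpfp_child_fee(200, 200, 150, 10))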

RBF: The Existing Algorithm

Replace By Fee replaces the earlier transaction that you broadcast in other nodes' mempools. That's where the replacing happens. There is a set of rules governing whether or not a transaction is eligible for being evicted from the mempool and replaced by a new one. Here's a few things that the 'accept into mempool' code checks...

  • The transaction that you're attempting to replace has flagged itself as eligible for replacement. This is flagged at the transaction level, but it's inherited through any as-yet-unmined inputs that you're spending. If any of a transaction's inputs, or its inputs' inputs, are flagged as replaceable, then this current transaction is also considered eligible for replacement. If neither a transaction nor any of its unmined parents is marked as replaceable, any transaction with an input conflict (that is, one that would spend the same inputs) is rejected with the error "txn-mempool-conflict".

  • Requires that all inputs already exist in the UTXO set. No currently unmined inputs are allowed in a replacement transaction. This is a tighter rule than the ideal one, which would be to check that the replacement doesn't require 'low fee junk' to be mined first; requiring only already-mined inputs is a simple way to approximate that check.

  • A replacement candidate must pay more in fees than all the transactions it replaces. The rationale for this is that sending transactions across the network consumes bandwidth. The higher fee of the new transaction, in theory, pays for its increased usage of bandwidth: once for the original broadcast and then again for every subsequent replacement. Note that the nodes keeping and broadcasting this transaction don't get paid -- only miners do. In that sense the fee is more of a social gesture than a net payment to every node that sees the transaction.

Note that this is in total fees, not fee rate. Any replacement transaction must pay more in total fees than the entirety of any and all transactions that the replacement would displace from the mempool. There's the potential that you'll be replacing an entire "package" of unmined transactions, a parent-child chain of transactions that are looking to be mined. If your replacement is small and the transaction it replaces has an extremely large child also in the mempool, your effective fee rate (roughly calculated as the fee paid per byte of transaction that is included in the block) will need to be much higher than the original's, as you need to cover a larger amount of fees with a smaller number of bytes.

  • Finally, if the 'package' of transactions that you're looking to replace numbers greater than 100, your transaction replacement won't be added to the mempool. In other words, if someone has attached 99 transactions onto the transaction you'd like to RBF, you're shit out of luck. You'll have to wait until there's enough room in a block for your original to be mined.
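
Put together, the checks above look roughly like the following sketch. The data structures and the 1 sat/byte incremental relay fee are simplifications of my own; the real logic lives in Bitcoin Core's mempool acceptance code:

def can_replace(new_tx, conflicts, utxo_set, incremental_relay_fee=1):
    # new_tx: dict with 'inputs', 'fee' (satoshis), and 'size' (bytes).
    # conflicts: the existing mempool transactions (the 'package') that
    # new_tx would evict, including their descendants.

    # Everything being replaced must have signaled replaceability.
    if not all(tx["replaceable"] for tx in conflicts):
        return False  # "txn-mempool-conflict"

    # The replacement may only spend already-mined (confirmed) outputs.
    if not all(inp in utxo_set for inp in new_tx["inputs"]):
        return False

    # No more than 100 transactions may be evicted.
    if len(conflicts) > 100:
        return False

    # Must pay more in *total* fees than the whole package, plus enough
    # to cover relaying the replacement's own bytes.
    package_fees = sum(tx["fee"] for tx in conflicts)
    return new_tx["fee"] >= package_fees + new_tx["size"] * incremental_relay_fee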

Proposed Changes

Russell O'Connor published a proposal to change how the RBF rules work, or at least two of them. The proposal would update the total fees rule. Instead of a replacement needing to beat the absolute fee amount of all transactions that it would be replacing (aka the "package" of transactions), it'd only need to beat the effective feerate of the original. Additionally, the proposal would amend the 4th rule, such that the fee on your replacement is at least as much as the minrelayfee on the total package you're looking to displace from the mempool.[1]

Why is minrelayfee used as a minimum? A transaction that's replacing a larger set of transactions removes already transmitted bytes from the mempool. This rule change makes sure that the replacement transaction 'pays' for the cost of relaying those removed bytes.

Ok this is all pretty tedious. Let's take a look at some examples.

Miner Incentives, A Consideration

There are two cases that we should consider: that of a larger transaction replacing a smaller transaction (small txn -> larger txn) and that of a smaller transaction replacing a larger set of transactions, or package (large package -> small txn).

Current Rules

small txn -> large txn: Rule 3 stipulates that the total fees must be greater, with no regard to fee rate. In practice, no replacement is accepted if it lowers the total feerate of the mempool. (source). In any case, this shouldn't happen often: the motivation for RBF'ing a transaction is that the block inclusion feerate cutoff has spiked -- replacing one transaction with another, larger one with a lower fee rate makes it less, not more, likely that your transaction will get mined in the next block.

large package -> small txn: The smaller transaction must pay more total fees than the existing package. The miner doubly wins: they're making the fees of a large transaction in a smaller byte footprint.

Proposed Rules

small txn -> large txn: Miner's choice strictly improves. The fee rate per byte that they're including has increased and the net fee of the new, larger replacement transaction is greater. This is no change from the current scheme.

large package -> small txn: Miner's choice also improves. Although the total fee that they will make for mining the smaller replacement transaction is net-net smaller than the fees the entire large package would have earned them, given a competitive environment for blockspace (ostensibly why the RBF was triggered in the first place), the smaller transaction with the higher per byte fee rate is more likely to be mined than the larger, lower fee per byte package it's replacing. The incentives of the miner (highest fee per block byte) and the RBF'er (having the transaction confirmed for the lowest reasonable fee) align.

Wherein We Contemplate a Word Problem

Let's take a closer look at the large package -> small txn case, as that's clearly the one where the proposed rule change has the greatest impact.

A 1ksipa size transaction with a 10ksipa sized child transaction is in the mempool. The current feerate on the block is 2 satoshis / sipa[2]. The total fees that these two transactions, or package, pay is 2ksat + 20ksats = 22ksats.

Under the current scheme, a replacement transaction of size 1ksipa would need to pay at least 23k satoshis, a feerate of 23 satoshis / sipa. This is an 11.5x increase in feerate from the original package's rate of 2 satoshis / sipa.

Under the proposed scheme, a replacement transaction of size 1ksipa would need to pay 12k satoshi in fees in order to replace a set of transactions of size 11ksipa. The effective feerate on the replacement transaction is 12 satoshis / sipa, a 6x increase in feerate above the package it's replacing.

The proposed ruleset strictly improves the feerate of the mempool, while lowering the fee ceiling for replacing a large or weighty transaction.
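
Here's the same word problem as a quick back-of-the-envelope script. The 1 satoshi/sipa minimum relay fee and the exact decomposition of the proposed rule are my assumptions, chosen to reproduce the numbers above:

# The stuck package: a 1ksipa parent with a 10ksipa child, both at 2 sat/sipa.
package_size = 1_000 + 10_000                       # sipa
package_feerate = 2                                 # satoshis / sipa
package_fees = package_size * package_feerate       # 22,000 sats

replacement_size = 1_000                            # sipa
min_relay_fee = 1                                   # sat / sipa (assumed)

# Current rules: beat the package's *total* fees, plus relay for your own bytes.
current_min = package_fees + replacement_size * min_relay_fee
print(current_min, current_min / replacement_size)  # 23000 sats -> 23.0 sat/sipa

# Proposed rules (my reading): beat the package's *feerate* on your own bytes,
# plus pay the relay fee for the net bytes you're evicting from the mempool.
proposed_min = (replacement_size * package_feerate
                + (package_size - replacement_size) * min_relay_fee)
print(proposed_min, proposed_min / replacement_size)  # 12000 sats -> 12.0 sat/sipa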

Notus Commentarius

RBF mechanics closely resemble those of an auction, where the rules for replacement effectively set the next price at which the auctioneer will accept a bid. The current rules set the floor for the next bid extortionately high if the number of bytes you're looking to replace is quite large. Russell's proposed rule change lowers the bid floor to a more reasonable metric.

One of the largest arguments against changing the replacement fee rules, as far as I can tell, hinges on the argument that without a fee hike, anyone could spam the network with RBF requests, creating mempool churn and eating up network bandwidth. I'd argue that any RBF mechanism leaves an opening for this style of DoS attack on a node. The difference between the current rules and the proposed ones is not the mechanism, but merely the floor cost for waging such an attack -- at some point your transaction will be mined and the fees you've offered up will be paid. Further, the only case where this attack would be truly expensive is when you're looking to replace a large number of bytes in the mempool -- perhaps that truly is the most likely DoS attack vector, however.

Thanks for sticking with me! Hope you enjoyed learning more about how mempool transaction replacement works! I left a few things off, but the main gist of how RBF works is all here.

[1] Russell O'Connor's proposed RBF rule changes (source Bitcoin ML) vs BIP125, the current RBF rules.
[2] A sipa is a byte/weight measurement. For simplicity's sake you can consider a sipa to be a byte.

#rbf #bitcoin #explainers
26 Dec 2018 c.e.
Blockchains Against Evil, Impressions

Takeaways from a blockchain ethics conference I attended earlier this month, Blockchains Against Evil

I attended a day-long conference/seminar earlier this month that pulled together a bunch of people in the 'blockchain' space to talk about trends in the industry, especially around security and lawlessness.

The Event, Specifics

The event itself was held in a rented conference space off Divisadero, in San Francisco. There were about 30 people in attendance, if I had to guess. Most everyone who attended worked or invested in the 'blockchain' space. There was a good mix of job types and roles: programmers, investors, company-runners, cypherpunks, non-profit directors, etc. I knew a few people from the Internet, but most were new faces.

The day was split up into a bunch of round-table talks. I honestly can't remember most of the themes. I took notes, but I've since misplaced the notebook. I'm planning to write up a longer piece on the insights the discussions gave me that specifically related to privacy and secrecy and how cryptography and the state interplay in this, but that piece is far more ambitious than I have the time or inclination to reason through now. Much like my lost notebook -- it'll be dug out later.

Themes and Thematics

Instead, I'll leave you with a short overview of the most salient points that were discussed. Most of these are a paraphrasing of others' points and ideas. I take credit for only the spotty transcription.

  • Crypto has provided a secure mechanism for ransomware makers to get paid. The global nature of the web plus Bitcoin's ubiquitous reach[1] mean that ransomware is truly a viable attack for anyone who's got access to a Bitcoin wallet. This is all of you. Another lens to put on this one is that it's put a premium on securing networks of valuable data. If your data being inaccessible makes your work impossible, it's likely only a matter of time until you're a target for a ransomware play.

  • While ransomware has placed a bounty on your databases, Bitcoin and other Proof of Work currencies have placed a directly calculable value on a computer's CPU cycles. Previous hacking rings have focused on skimming credit card numbers[2]; the past decade has seen more and more viruses that aim to steal compute power rather than credit cards or identities. That's because they can make money by stealing computation cycles and your power to mine crypto. I'd be curious to see stats on how the rise of ASICs has affected the profitability of botnet miners. Bonus points for an analysis that includes the impact of the recent price drop on said profitability.

  • Personal security is hard to measure. There've been several high profile cryptocurrency and 'blockchain' project attacks recently that involved getting a phone company to port a target's telephone number to a new SIM card, giving the attackers access to their SMS two-factor authentication backup codes. The general advice for avoiding this sort of problem is to ask your phone company not to port your number without being provided with a secondary PIN or the like; others at the conference had switched to Project Fi, Google's phone service, for the express reason that they don't have a customer support telephone number. (Personally, I already use Project Fi). More generally, there seemed to be an interest in hiring a hacker to do a personal security audit. If you or someone you know runs this kind of a service, let me know. I'd love to hear more about what kind of people you work with and what your price point is for an individual investigation.

  • Demand for decentralized services historically has been rather complex, if not a bit on the weak side. Often, they crop up as alternatives to more centralized services when a core user group is pushed off of the more centralized services (i.e. music and film piracy, right-wing punditry, and most recently sex work with SESTA/FOSTA[3]). As difficult as it is, it's pretty wild to imagine existing in a fully decentralized world, one where no one has the power to deplatform anyone else. It's hard to imagine a world where everyone runs their own decentralized server, a la the Urbit dream. Curation and searchability seem like they'd be particularly high value services in this kind of world. It definitely would be heading into 'pure free speech' territory, of the likes we only dream of currently -- but remember, folks, that while speech may be free, slander is still illegal.

  • Personal anonymity. What right do you have to decide who and what can see where your money is going? I've got a lot of unfinished thoughts on this that I'm hoping to put up later in a separate piece. If and when I do, I'll update this to link to it.

  • Closely related to that, do anonymous payment networks breed demand for dark market goods? I'm talking about child pornography and buying hitmen for untraceable cash. I think the recent Epstein revelations[4] point towards no, vice isn't necessarily driven by access to invisible money. Honestly, if anything it's moving illicitness from the cash economy to the digital economy. Cash is largely untraceable. If you lose it in a fire, it's gone. In some ways, this is oddly similar to problems with keeping private keys and wallets safe for digital cash. But I digress. To what extent has a traceable money supply kept people from exercising base desires that a lack of traceability now enables? Again, I think this is smaller than we suspect, but maybe I'm wrong. If anything, I think dark money and the dark Internet (Tor) have made buying illegal drugs and child pornography much easier than they were in the past, but does ease of use drive volume? These things are still illegal. I'd love to read a study on the impact of digital darkness on illicit goods trade, though I imagine hard numbers on this are hard to come by.

In Exitus

Digital money has created huge new opportunities for criminals and privacy lovers alike. I feel like the cat's largely out of the bag with the existence of digital money systems such as Bitcoin and Zcash (and Grin soon!). I'd love to see personal and institutional privacy and security become both more widely understood and practiced -- though at its core this problem requires a greater investment in basic computational understanding.

Will we, as a society, be able to educate ourselves fast enough to protect our systems and selves against the rising tide of spying nation states and exploitative hackers? I guess we'll find out.

I really enjoyed spending a day hearing about the in's and out's of blockchain ethics. I'm really grateful that there's people in SF who want to have these conversations, and went so far as to organize a space where we could discuss them. Huge <3 to all the organizers and other attendees that made the day incredibly worthwhile.

[1] By Bitcoin I really mean any value-acknowledged cryptocurrency.
[2] See the story of The Iceman
[3] A lot of this discussion hinged on the stuff John Backus has been digging up lately, I really like his article on Music Piracy
[4] The man basically ran a prostitution ring for wealthy and well-connected men, from a cadre of underage women that he developed. Miami Herald has the story.

#blockchains #conference-swag #impressionism
28 Nov 2018 c.e.
Getting AMPed Up or Reflections on Lightning post Adelaide

I've recently been thrust head first into my first open source software ecosystem. I love it; I also feel like I'm struggling to contribute anything worthwhile because I've been spending so much time just getting up to speed -- the particular subsystem of software that I've landed in is incredibly complex and has a bit of scattershot documentation, spread across a couple of mailing lists and two enormous projects.

I want to give some meta commentary on the mechanics of getting involved in a new, active space, and then give a more nuts and bolts overview of the considerations that are shaping the edge of Lightning at the moment. I'm sure I've left things out, so know that my list is just a subset of all the things.

Finding Active Edges

There's a difference between getting up to speed and active in a currently evolving field versus learning a topic or subfield that's pretty much static. By way of example, I'd largely consider calculus and functional programming, as fields, to be pretty static, i.e. there's interesting stuff happening at the margins, probably, but there's not a lot of paradigm shifting research going into how to describe functionalism or what a second derivative is. As a field and practice, the borders of meaning and scope have largely been well defined.

'Active' spaces are different. They have action: people actively working on new approaches, building out software and new ideas. The presence of people, and the messiness of definition and conversation, are beacons pointing to the interesting and new things the future will hold.

Arriving at an edge or beehive of activity where there are people working is like descending into a bit of chaos. In an active field, there's usually a lot of independent research and motivations and interests that keep the actors on this edge a bit spread out. Figuring out where the edges lie is difficult because the definition of the edge is its lack of a roadmap. Sometimes you can find artifacts that strictly define at least a subset of those edges -- the wiki tracking decisions made at the Lightning Summit in Adelaide two weeks ago is one such example.

I was lucky with Lightning, in a lot of ways. The biggest one is that due to the team I joined, I have a lot of direct access to people that have been working on the edge of the space basically since the beginning (h/t to cdecker). The other is that I joined just in time to attend the latest spec update meeting. These meetings are rare -- the last one happened over two years ago in Milan for the first lightning spec.

I'm not going to talk directly about what happened at the meeting; if you're interested check out the lightning mailing list, where we're currently in the process of hashing out the decisions made at the summit (which you can see here), or take a look at the PRs currently in progress on the lightning-rfc Github project.

Rather, I'd like to give some really meta impressions of what kind of thinking it takes to get involved in a project like Lightning -- hopefully this metaness will give you a portrait of what kind of conversations you need to be having or questions you should be looking to get answered when getting involved in a new field.

First off, it's hard to contribute to a field if you don't really understand the underlying system that it's operating on top of. Sure, this is easy enough to say, but just figuring out the contours of the system that define the problem space can be tricky. A lot of the stickiest problems that Lightning developers deal with, especially when looking to expand the protocol or improve the experience, are either limitations in the underlying Bitcoin protocol or a self-imposed mandate for privacy. If you don't have a good grasp on the goals of Lightning with regard to privacy (keep it, as much as possible), or a pretty deep knowledge of how Bitcoin itself works, you're not going to be able to contribute much to the conversation around Lightning -- mainly because you're going to struggle to even understand, let alone communicate with, people who are already working in the space.

I'm an incredibly quick study, but still relatively new to the Bitcoin and Lightning space. My largest contributions to date can mostly be summed up as asking clarifying questions. This may seem trivial, but I've come to see that it's an important contribution nonetheless -- comprehensibility is an incredibly important aspect of a system that needs and wants newcomers to both feel welcome to the space and able to contribute. And Lightning definitely could be more comprehensible!

Into the Deep

With an eye to making the Lightning space a bit less opaque, I'd like to run through a few of the higher level considerations that seemed to come up with some frequency during the weeks leading up to and at the summit itself. I think it's safe to say that these themes will be continuing problems and on-going discussions in the Lightning ecosystem.

Bitcoin

Bitcoin protocol limitations come in a variety of flavors. Here's a quick, condensed (and definitely contains omissions) rundown of things in Bitcoin that hold up or complicate Lightning feature development:

  • Fees. Lightning is a 'second layer' protocol, sure, but at some point it has to publish transactions on the Bitcoin blockchain. Lightning's security mechanisms (ie your ability to successfully pull your money out of a channel) rely on the ability to get a transaction into a block within a reasonable amount of time. Lately, this hasn't been a problem, but if and when fees spike, there's a lot of potential to run into trouble if your transactions aren't able to get confirmed. Fees are complicated by the fact that 1) there's two parties involved in creating and spending all the transactions, 2) commitment transactions are usually composed, signed and stored long before you might actually need them, 3) economic incentives mean that you're probably looking to pay the smallest fee possible to accomplish what you want, but this means that you're probably in a bad position in terms of being able to get your transactions on chain in a fee spike event. Lightning as a protocol would like to move away from the business of needing to know what the fees should be, but that means we're going to run into another corner case of the Bitcoin transaction ecosystem...

  • RBF and CPFP. If you're not deep in the Bitcoin wallet management weeds, there's a good chance you've never heard these acronyms before. Briefly speaking, these are two mechanisms that the Bitcoin protocol provides for getting a transaction through that has largely been pushed to the back of the queue for being included in a block (mines/confirmed etc) because of a fee spike. RBF stands for Replace By Fee, whereby you basically re-issue a new copy of a transaction, but one with more fees per sipa[1]. CPFP means Child Pays For Parent. It takes advantage of the chained nature of Bitcoin transactions, and attempts to 'sweeten the deal' for miners such that they'll mine your first, low fee transaction in order to also be able to mine a high fee child transaction. The parent plus child chain is typically termed a 'package'.

  • Schnorr. What is Schnorr? Schnorr is a proposed change to multiparty signature composition. Including it in Bitcoin will require a revision of the signature verification mechanisms. In addition to more compact and easier to verify signatures, Schnorr unlocks a certain amount of obfuscation and script burying. Schnorr can make Lightning channel openings invisible on chain (right now they're a bit easy to spot[2]). There's a few other nice things that Schnorr signatures enable, that I don't exactly remember the details of, but they'll allow Lightning to send payments in parts more easily and securely.[3]
  • Script Sighash Flags. Christian Decker's been spearheading an effort to update the way that Lightning balances are enforceable on chain. (The updated protocol is called Eltoo, you can read more about it in this high level article I wrote, or the paper itself, if you want something a bit more in depth.) This requires a change to Bitcoin script, specifically the addition of a new sighash flag called SIGHASH_NOINPUT[4][5]. Work on the new, improved state management protocol is basically stalled until this gets merged into the Bitcoin reference implementation. On another note, there's some other boutique, existing sighash flags that will probably start being utilized by Lightning transactions as part of the attempts to dodge the fee problem. Watch this space.
  • Transaction malleability. This is an ancient problem now in Lightning land, as it was resolved when SegWit landed. If you're going to be doing Lightning, you should know how SegWit works, as that's the only type of transaction protocol that Lightning wallets speak. As a historical note, transaction malleability basically refers to how fixed the transaction hash is. Lightning, in its current form, requires the guarantee that the hash of a signed transaction can't be changed (by a miner or the other party etc). SegWit fixed this -- it's practically never mentioned now. In other words, this problem has moved off the edge, largely because it's settled.

Privacy

This feels like one that's taken for granted more than most things, but it largely informs a lot of architectural decisions that get made. Maintaining privacy is important, and it manifests itself in a bunch of ways. Here's a short list of things that privacy considerations impact.

  • Error handling. How do you know who bungled your payment?
  • Payment correlation / decorrelation. Can an observer tell whether payments sent over different channels, or the same payment sent at different times or along different routes, are actually the same payment?
  • Getting a clear picture of current network health. It's hard to measure a payment success rate if the payments themselves are localized and unreadable.
  • Autopilots. How much information should nodes reveal, to help other nodes figure out who to connect to?
  • Anything that might leak private or proprietary information including but not limited to: channel balances, node wallet UTXOs, payment origination, payment destination

Other assorted things

  • Liveness. Payments can get stuck if nodes along the route aren't responding. This is particularly bad if a payment has to 'go to chain', ie be finalized via the blockchain.
  • Liquidity. Lightning payment capacity is a constantly mutating DAG. Channels' total value is known, but the balance of funds within each channel is often kept secret (see Privacy, above). This makes it hard to predict which routes will fail until you try them -- the advertised channel capacity may be pointing in the wrong direction. This is exacerbated by the fact that channel funding is one-sided at the moment. Splicing and dual-funding will help this problem.
  • How important are receipts? This deserves a much longer post and honestly I need to do more research around it; I won't get into it here.

In Exitus

I'm having a great time.

[1] A sipa is another term for a kiloweight, which is a Bitcoinic way of weighting bytes in a transaction to calculate the fee rate of a transaction. As a general rule, miners prefer transactions with the highest fee rate per byte. If a fee rate spike is happening, you're going to want to up your transaction's effective rate.
[2] As an aside, we green lighted work on a different signature scheme (some 2 party single ECDSA sig algorithm) that can let private channels remain invisible on chain. Nice because it doesn't rely on Schnorr.
[3] There's been a lot of discussion around AMP (base AMP, OG AMP, low versus high AMP). This deserves a longer discussion, but know that Schnorr sigs will provide a way to do split-payments with fewer drawbacks than any of the current proposals. In fact the coming of Schnorr is a background vibe underpinning a lot of the discussion, as it makes the timeline question more important.
[4] I believe the final name is settling somewhere near SIGHASH_NOINPUT_UNSAFE for #reasons.
[5] What's a sighash flag you ask? Briefly, it's a bit that's added to a transaction signature that tells the verifier what fields in the transaction that the signature signed. You can read more about them here.

#lightning #bitcoin #oss #edges
22 Nov 2018 c.e.
A Brief Love Letter to XOR

I'm taking an online crypto class[1] right now, and it's been forcing me to get more intimate with the bitwise operator XOR. On top of being incredibly lightweight, there's a few really cool things that XOR can do.

In the spirit of the Thanksgiving season, here's a brief love letter to my favorite little boolean operator, XOR.

What is XOR?

XOR stands for 'eXclusive OR', where 'or' refers to the boolean logic operation. What does that mean, a boolean logic operation? Briefly, it's what conclusion you draw from two truth values. It's kind of like a predetermined agreement mechanism. Boolean logic is a rule that you apply to two results, to resolve those two results to a single true or false.

A simple example is probably helpful. Let's say that we've got two voters, and we're trying to take their two votes (either YES or NO) and return a single decision for the 'election'. How these two imaginary voters' votes are counted is the role of the Boolean logic operator.

There are two fairly common boolean operations that you might have heard of before: and & or. The decision for 'and' is fairly intuitive: if both voters vote YES, then the result is YES. Otherwise, the result is NO. We'll only get a final YES vote if both of the people we're asking say YES. If either voter votes NO, the final result from the boolean operation will be NO. The 'and' decision framework requires 100% agreement.

'Or', on the other hand, says that if either voter says YES, then we'll take the result to be YES. The 'or' decision framework requires only one single 'voter' to say YES in order to return a YES.

So what is XOR? XOR only returns true if the voters disagree. If both voters say YES, 'xor' will return NO. Same thing if both voters say NO: 'xor' will still return a NO. It's only when one 'voter' has chosen YES and the other NO that XOR resolves to a YES. It doesn't matter which voter says YES and which one NO, as long as the voters disagree XOR returns YES.
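
If it helps to see the three decision frameworks side by side, here's a tiny Python loop that prints the whole 'voting' table (True standing in for a YES vote):

# Every combination of two votes, resolved by and, or, and xor.
for a in (True, False):
    for b in (True, False):
        print(a, b, "| and:", a and b, "or:", a or b, "xor:", a ^ b)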

Why is it called exclusive or? Great question. I have no idea, but you can probably find out on the Internet.

XOR As Your Encryption Friend

XOR does some pretty fancy things. If you take a series of bits and XOR it together with another series of bits, the original series of bits can be retrieved out of the resulting string, but only if you know what the second series of bits was. It's almost impossible to tell what the original bit series was. Here's a quick example, to show you what I mean.

// If I take the bit series 0101 and XOR it with 1010  
0101 xor 1010 =  1111

A result of 1111 doesn't tell you what bits belong in which of the strings that you xor'd together. You could have xor'd 1111 with 0000. Or 1100 with 0011. But! If you do happen to know one of the inputs, you can easily extract the other.

// If I know 1010 and the result, 1111, I can extract the other input   
1010 xor 1111 = 0101

This is incredibly useful in cryptography. If you take a message and XOR it with a 'secret key' (a random series of bits) of the same size as your message, voila, your message is now encrypted. If your 'secret key' is a random enough series of bits, then it will be practically impossible for anyone to know what the original message bits were. To decrypt this message, all you need is the encrypted message and the key that was used to encrypt it.[2]

// How to encrypt a message   
message xor key = encrypted_message  

// How to decrypt a message  
encrypted_message xor key = message
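
Here's the same idea in Python (my go-to for playing with this sort of thing), operating on whole byte strings instead of four-bit toys; the key is just random bytes, one per message byte:

import secrets

def xor_bytes(a, b):
    # XOR two equal-length byte strings, byte by byte.
    return bytes(x ^ y for x, y in zip(a, b))

message = b"attack at dawn"
key = secrets.token_bytes(len(message))   # one random key byte per message byte

encrypted = xor_bytes(message, key)       # message xor key
decrypted = xor_bytes(encrypted, key)     # encrypted_message xor key
assert decrypted == message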

Other XOR Magic

XOR has a little bit of 'magic' that happens when you use either a set of all 0's or all 1's to XOR against.

You can 'bitflip' any series of bits by XOR'ing it with a series of 1's.

// Flip a bit set!
111000 xor 111111 = 000111

XOR'ing by a set of 0's is an 'identity function' -- it'll return the same series of bits as what you originally XOR'd in. It's probably not a good idea to use a set of 0's as your encryption key -- it'd be like putting your message behind a piece of glass. XOR'ing by 0 is transparent!

// Show me the same!
111000 xor 000000 = 111000

In Exitus

The next time you use encryption to send a message with a friend over the Internet, give a little thanks for your crypto workhorse bestie, XOR.

[1] Dan Boneh's Crypto I on Coursera
[2] This method of encryption is generally called the One Time Pad encryption, as the key is as long as the message. So long as you never reuse the same key on a different message and your key is a random stream of bits, this method of encryption (xor'ing the message with a key) has what's known as perfect secrecy. The biggest, practical problem with this method of encryption is that the person decrypting your message needs to know the key. You'd need a secure way to send them the key, as anyone who gets the key can then decrypt the message. The key is as long as the message though! If you have access to a secure method of communication that can transmit something as long as the message, you should just send the message itself over that secure communication channel. It's just as long, and your chatting partner won't have to decrypt it. This equal length key problem is why they say that perfect secrecy is practically impractical.

#xor #boolean #love-letter
20 Nov 2018 c.e.
iOS First Impressions

A long time Android user, I made the switch over to my first iOS phone this week. I've never used any Apple phone before, in any true capacity, despite knowing a good number of iOS devs. I'm excited to finally see their work. Here's a few of my first impressions on the platform!

The Gestures are Intuitive

I've watched other people swipe their way through iOS interfaces and wasn't really all that confident that I'd be able to figure it out. Surprisingly, it didn't take me all that long to get to the point where I could get them working. I did need someone to show me how to get to the notifications screen though -- I kept landing on the screen with all the widgets instead. Otherwise, they're pretty great. I especially love how the camera and flashlight buttons on the lockscreen feel like actual buttons.

Getting Back is Hard

Sometimes I end up back on the 'home' screen and it sucks. Luckily, apps seem to have a really good memory of where you left off, so tapping into them from the home screen is super intuitive. Unlike Android, where tapping the home icon has fairly unpredictable behavior, based on how they programmed the original launch intent to work. Flexibility is nice, but this is one place where having a predictable user experience is really reassuring.

Buttery Smooth

Everything animates so smoothly. It's incredible. The way chat bubbles slide around on the page. The smooth swiping motion I can make in the Twitter app and get back to the previous page. I can't get over how great it is, how pervasive. Everything moves in beautiful ways. This phone is an absolute delight to interact with.

Moving All My Settings Over from Android

I tried to use the Move to iOS app to get all of my accounts and things moved over from my Android phone, but couldn't get the bluetooth pairing to work. I suspect there was something wrong with my Android phone, as I also had trouble when trying to pair it with my Garmin running watch. I eventually ran out of patience and went ahead and set up the phone without it, only to realize later that there's no way to come back and make it work without wiping the phone entirely. Luckily, most of the Google apps transfer over pretty cleanly. That's been nice!

The biggest exception would be the Signal app. Switching cellphones changed my safety number, so now I can't use Signal on the Android phone as well. I also had to re-link my desktop app since I switched phones. I really thought it'd work as a secondary device that I could just add to my account, but it seems that the whole ecosystem is pretty strongly tied to a concept of there being a Single, Blessed install of the Signal phone app. Kind of a bummer for wanting to be able to switch between phones on the reg, as your messages don't get propagated between devices (and you'd have to reregister every time you make the switch). I don't think I'll be switching that often, but it is a bit of a bummer nonetheless.

Switching SIM Cards

I use Android, which also means that I use Project Fi. I spent a decent amount of time and effort researching alternative ways to use a different phone provider but T-Mobile was hellishly expensive (the iPhone is locked to T-Mobile) and there wasn't a clear cut solution for what I really wanted to do (have one number ring two phones). I really want to be able to keep my phone number the same, so that I'm easy to reach by anyone, anywhere, but Project Fi isn't supposed to work with Apple phones. Turns out that it does work, somewhat. I hear there's limitations (it only uses the T-Mobile network, none of the international data works), but since I'm not planning to get rid of my Android phone anytime soon, I should be able to switch back without too many problems.

How to Share Things

I'm still pretty confused with what that arrow out of a box even means. I hate it. It's ugly. I don't like it. Someone make it go away.

The Notch and Other Unaesthetic Things

The title for this section is a lie. There is only one unaesthetic thing that I've observed so far about the iPhone XS that I've got, and that's the notch. It's terrible and you're lying to yourself if you think otherwise. I can smell the Stockholm Syndrome from here.

Discovering Which of Your Friends Are Discriminating Assholes

"Hey you're blue now! Whoohoo". Fuck you. Fuck all of you.

That color discrimination runs deeper than you think, man. At the last company I worked at, the full-time employees had blue badges. The contractors' badges? Green.

You're Not Getting My Face

Or my finger prints. This is platform independent, but it does suck. I'm pretty anti-dead man switches in general, as in anything that lets you into my phone when I'm dead or otherwise incapacitated is generally off limits. I hate how sexy smooth the login experience looks though. I also resent how they only switched to face detection (and away from the equally problematic fingerprint scan) because they needed more screen space.

They got rid of the fingerprint scanner but they couldn't get rid of the notch. Terrible.

In Exitus

I'm incredibly impressed at how easy it's been to switch over, even without the Move to iOS app working as intended. In a lot of ways, this is because Google has made so many of their apps available for iOS! Thanks Google.

All in all, I'm a little embarrassed at how long it's taken me to give iOS a try. I really love it. I feel a bit bad for how quickly I've come around to liking it, given how staunch and how deeply entrenched of an Android user I've been. I've always known that the design practice at Google left a lot to be desired, but seeing and experiencing an iOS machine in practice has really been eye opening to how many misses Google made at some really serious decision junctures.

Or maybe Apple just patented all of it. Assholes.

#iOS #first #impressions #android
29 Oct 2018 c.e.
Understanding Eltoo

Simplified Channels, Simply

This article assumes base knowledge of the existing Lightning Network contracts and Bitcoin transaction composition. This is a lot of base understanding to have, and in fact, I'd argue that it's probably the biggest challenge to fully understanding what eltoo is really getting at.

That being said, I'll do what I can to explain it such that it's understandable.

Let's start by first understanding how the existing Lightning network contract invalidation system works.

The original Lightning protocol relies on a series of half-signed transactions. When the channel balance needs to be updated, you exchange a new set of half-signed transactions that update your balance. In order to keep your channel partner from broadcasting an old, invalid transaction that you've signed, every time that you exchange a new, updated transaction that reflects the current state of the payment balances, you also exchange a 'penalty' transaction, of sorts, that allows you to claim all of the Bitcoin in the channel, if the other person in the channel accidentally or intentionally publishes an old transaction state.

Each of these exchanged transactions spends the same output -- the one created by the Funding transaction.

It'd probably be useful to spend a bit of time here talking about how Bitcoin transactions work, as it'll be handy when we get into eltoo. Every Bitcoin transaction is a global state update. It takes existing, unspent output objects, spends them by providing a signature that proves you can spend them, and creates new unspent output objects. The set of previous outputs that your Bitcoin transaction "uses up" are called inputs. Every input is an output of some other, previous transaction.

Let's bring this back to the Funding transaction then. A funding transaction has a single output. This output can only be spent by providing two signatures, one from each party in the channel. This is called a 2-of-2 multisig transaction. The funding transaction is committed to the Bitcoin blockchain, which then makes this Funding output eligible to be spent by the channel parties. The only way these funds can be spent is if both parties sign a transaction. The channel balance is updated, then, by creating new, ephemeral transactions that spend this output, re-apportioning the total value in the channel to each party as a reflection of their current balance.

As a concrete example, let's say you and I wanted to create a Lightning channel between ourselves. I'm going to offer up 2 Bitcoin, you're putting in 1 Bitcoin. We'd make a funding transaction that takes two inputs: my 2 bitcoin and your 1 bitcoin, and creates one output of 3 Bitcoin. This 3 Bitcoin can only be spent by a transaction that has both of our signatures on it.

To record what the original balance is, we'd create a transaction then that has, as an input, the 3 Bitcoin funding transaction result, and that pays out two outputs: one to me for 2 Bitcoin and one to you for 1 Bitcoin. It's a lot more complicated than this, but for the sake of understanding how eltoo works, this simplification will suffice. The whole point of Lightning is that you and I can now Do Business between each other. I buy you lunch, worth 0.25 BTC. Rather than paying me back with Square Cash or Venmo, we could just create a new transaction that spends the funding transaction, throwing away or invalidating the first one that we exchanged. The new, updated transaction would reflect the new balance of accounts between us: it'd pay me out 2.25 BTC and you'd get .75 BTC.
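
If it helps to see the bookkeeping laid out, here's the same example as plain data (a toy sketch of my own, not real transaction formats):

# The funding output: 3 BTC, spendable only with both of our signatures (2-of-2).
funding_output = {"value": 3.0, "spend_requires": ["my_sig", "your_sig"]}

# State 0: the opening balances, spending the funding output.
state_0 = {"spends": "funding_output", "outputs": {"me": 2.00, "you": 1.00}}

# State 1: after the 0.25 BTC lunch, we sign a new transaction spending the
# *same* funding output; state_0 is now supposed to be thrown away.
state_1 = {"spends": "funding_output", "outputs": {"me": 2.25, "you": 0.75}}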

The problem is that the first transaction we exchanged, with the original balance values, still exists. In it, I get just 2 BTC and you still get your original 1 BTC. Let's say that you decide you'd like to stiff me the lunch I bought you (jerk), so you publish the original transaction onto the blockchain. I wouldn't be able to publish the later transaction that actually reflects the balance between us, because I'd also be trying to spend the same output from the Funding transaction. You can't do this -- only one Bitcoin transaction can spend a single output. So I'd be shit out of luck.

The original Lightning proposal solves this problem of past state transactions getting published by introducing a concept of penalties. Without going into too much detail, it effectively gives me the ability to penalize you for reneging on our lunch deal, by spending a special output from the stale state transaction that you published. The penalty for your actions is built into the outputs of the state transaction, and basically gives me the ability to take all of the Bitcoin in the channel for myself. So the punishment for trying to take back your .25BTC is a total loss of all of your channel Bitcoin.

Crime doesn't pay, at least not with lightning.

Every time we want to update the balance of accounts in the channel though, we need to exchange new penalty transactions, that invalidate the previous state. Or not really invalidate it, but provide a huge disincentive for you, if you decide to publish it anyway. Half of the security of the system, then, relies on you and me saving all of the penalty transactions that we exchange, because they map 1 to 1 to expired or invalid transactions. If you lose the penalty transaction for a particular old, invalidated state, and for some reason the other party finds out that you've lost it, they can publish that old transaction and you wouldn't be able to do anything other than accept your loss.

In that way, Lightning, as it exists today, largely resembles a practical implementation of the theory of mutually assured destruction. If you lose or reveal your nuclear arsenal (in this case, a set of transactions that invoke penalties for unfair actions on the part of your adversary, aka your channel partner), you're shit out of luck.

As any Soviet-era superpower knows, nuclear arsenals are costly to maintain. Here's where eltoo comes in. eltoo is an elegant proposal to do away with private arsenals of penalty transactions. In fact, eltoo does away with penalties entirely. Instead, it provides a mechanism for allowing any later agreed upon state to override any previously agreed upon state. As long as you have a copy of the most recently agreed upon transaction, you can publish it at any time and it's guaranteed to be spendable.

The key to this is being able to decouple a transaction from any specific output. The original Lightning transaction scheme relied on all state transactions spending the Funding transaction's output directly. Instead of pegging a signed transaction explicitly to one prior output, a signed transaction can spend any output for which it has a valid spend script. These types of transactions are called floating transactions.

This is huge, because it gives you the ability to 'fix' the channel balance. Previously, if your channel partner published a stale balance transaction, you couldn't do anything to 'fix' it because all of the existing state transactions that you have spend the same output: the funding transaction output. With a floating transaction, however, you have the flexibility to spend either from the funding transaction, or from any previous state transaction.

Let's go back to the previous example. I've bought you lunch; the most current transaction that we share between us says that I get 2.25 Bitcoin and you get .75 Bitcoin. You decide to publish the older transaction, where you get 1 Bitcoin and I get 2. That transaction spends the funding transaction output. In the original Lightning scheme, I can't fix this because all of the transactions that I have must spend from the funding transaction, and the funding transaction's single output has now been spent, by you. With an eltoo floating transaction, I can broadcast the most up-to-date transaction, spending not from the funding transaction output, but from the old state transaction that you've just published. This flexibility as to which output you're spending is what makes eltoo so elegant. There's no need for punishment, because old state updates are fixable: you just spend them again, ideally with the most up-to-date balance.
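
A sketch of that flexibility (hypothetical names; this isn't the actual eltoo construction, just the shape of the decision):

def input_for_latest_update(published_stale_tx=None):
    # With a floating signature, the latest update transaction can be
    # attached to whichever eligible output actually ends up on-chain.
    if published_stale_tx is None:
        return "funding output"                 # normal, cooperative case
    return "output of " + published_stale_tx    # rebind onto the stale state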

How does eltoo do this? By removing the concrete identifier of an input from the signature. To do this, you'd construct a transaction using the SIGHASH_NOINPUT flag, which means that the signature doesn't commit to the unique identifier of the input it's spending. Currently, this isn't a part of the Bitcoin spec; there's a proposal out for its inclusion.
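
Roughly, the difference is in what the signature commits to. A toy sketch (this is not the real Bitcoin sighash algorithm, just the idea that the prevout identifier gets left out):

import hashlib

def toy_sighash(outputs, prevout, noinput=False):
    # With noinput=True the prevout (txid:index) is excluded from the
    # digest, so the same signature remains valid no matter which
    # eligible output the transaction ends up spending.
    data = repr(outputs)
    if not noinput:
        data += repr(prevout)
    return hashlib.sha256(data.encode()).hexdigest()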

Once a transaction can spend any other transaction in a channel, though, what's to keep your cheating partner from just over-writing the correct channel balance with another, older transaction? eltoo solves this in a hackishly elegant manner, by repurposing the nLocktime field that already exists as a part of the transaction format. The locktime field already serves a dual purpose: it limits the spending of a transaction either by a required block height or by a timestamp. If the locktime is beneath 500 million, it's interpreted as a block height lock; anything above that is a timestamp lock. Any number above 500 million but beneath the current time, then, can safely be used as a sequence number for a set of eltoo channel transactions, as such transactions will be immediately spendable. Eltoo transactions can thus be ordered by locktime. Update transactions are configured such that they can only be spent by a transaction that has a higher locktime than their own, thus preventing an earlier transaction from overriding a later, committed transaction.[1]
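
As a sketch of the ordering trick (the 500 million threshold is a real consensus rule; the rest is illustrative):

LOCKTIME_THRESHOLD = 500_000_000  # below this, nLocktime means block height

def state_locktime(i):
    # Pick a value in the timestamp range but far in the past, so the
    # transaction is immediately spendable and the locktime doubles as
    # a monotonically increasing state counter.
    return LOCKTIME_THRESHOLD + i

def can_replace(published_locktime, candidate_locktime):
    # An update can only be spent by an update carrying a higher
    # locktime, so earlier states can never override later ones.
    return candidate_locktime > published_locktime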

eltoo is incredibly elegant in its simplicity. Floating, ordered update transactions obviate the need for arsenals of penalty transactions. They also do away with the need for secrecy -- any update transaction is valid and publishable. This greatly reduces the memory overhead needed for each channel that a lightning server maintains, as old state update transactions can be safely discarded.

There's a bit more nuance to update and settlement transactions that I've largely glossed over. You can read more in the eltoo paper itself. There are a lot of other really nice features that eltoo adds, like making multi-party channels feasible, and a better mechanism for dealing with fee spikes.

As it stands currently, eltoo won't be possible until the SIGHASH_NOINPUT flag has been added to the spec, but I'd be surprised if it isn't included within the next few months to a year.

For further reading on the original Lightning spec, see Poon and Dryja's paper, The Bitcoin Lightning Network: Scalable Off-Chain Instant Payments.

[1] The key to understanding how this works is to know that CHECKLOCKTIMEVERIFY doesn't compare against the actual time, but rather against the nLocktime specified in the spending transaction. See BIP65.

#bitcoin #lightning #eltoo #explainer
25 Oct 2018 c.e.
Reproducible builds with Bitcoin, Tor and turtles

Within the last few years, modern open source software projects, particularly those that deal with vulnerable or important systems[1], have worked to make the binaries that they publish for download verifiable. A binary is considered verifiable if anyone can download the source code, build it, and end up with a binary that exactly matches the publicly available one.

Ending up with an exactly matching binary, however, is a non-trivial task. As such, software projects such as Bitcoin and Tor have undertaken the project of making their builds deterministic and reproducible.

There are two parts to ensuring deterministic, reproducible builds. The first is to eliminate any amount of non-determinism from the build itself. The second is to remove any amount of non-determinism from the build environment.

The first is typically managed by removing timestamps, fixing the order of outputs or inputs via a stable sorting algorithm, and stripping out variable version information. You can see a more thorough treatment of the various sources of non-determinism in builds on the Reproducible Builds working group's website, under the heading "Achieve Deterministic Builds".
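
As a tiny example of the kind of fix involved: archives are a classic source of non-determinism, because directory ordering and timestamps leak into the output. Here's a sketch of neutralizing both with Python's tarfile module (my own illustration, not something Bitcoin or Tor actually use):

import os
import tarfile

def deterministic_tar(src_dir, out_path):
    # Pack src_dir so that the resulting bytes don't depend on file
    # system ordering or on when the build happened to run.
    with tarfile.open(out_path, "w") as tar:
        for root, dirs, files in os.walk(src_dir):
            dirs.sort()                 # stable traversal order
            for name in sorted(files):  # stable member order
                path = os.path.join(root, name)
                info = tar.gettarinfo(path, arcname=os.path.relpath(path, src_dir))
                info.mtime = 0                # clamp timestamps
                info.uid = info.gid = 0       # drop builder-specific ownership
                info.uname = info.gname = ""
                with open(path, "rb") as f:
                    tar.addfile(info, f)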

The second, removing non-determinism from your build environment, typically entails creating a clean room container that the build will take place in.

As part of an investigation to better understand reproducible builds, I spent some time walking through both the Bitcoin and the Tor projects' reproducible build processes. What follows is a short overview of how the Bitcoin reproducible build process works and a comparison between the Tor project's build process and Bitcoin's. Finally, I'll talk a bit about turtles, a work in progress project that takes the trustworthiness of reproducible builds a step further.

Building Bitcoin

Bitcoin, in keeping with the spirit of the project, relies on a public, multi-party verified binary. Any individual can download the source code for Bitcoin Core from Github, check out a tag, and then build the project. Gitian, the build verification tool that Bitcoin makes use of, outputs an assert file that lists all of the inputs, outputs, and packages used to build the source, along with the SHA256 hash of each of them. An independent verifier then signs this file with their PGP key and submits a pull request to the gitian.sigs repository. I did this a few days ago, for the linux and windows versions of the binaries; you can see the PR I submitted here. It's got two assert files, one for the linux binary and one for the unsigned Windows binary, plus two separate PGP signature files, which are just the assert files signed with my PGP key.
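
The verification itself boils down to hashing the binary you built (or downloaded) and comparing it against the digest listed next to the same filename in the assert file. Something like the following sketch, where the filename is just an example:

import hashlib

def sha256_of(path):
    # Hash a file in chunks; this is the digest the assert files record.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# e.g. sha256_of("bitcoin-0.17.0-x86_64-linux-gnu.tar.gz") should match
# the digest listed for that filename in the assert file you produced.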

Bitcoin's reproducible build setup uses a script-wrapped version of the Gitian builder. There's a couple of steps involved in setting up your machine to do a reproducible build and this doc in the bitcoin-core notes does a good job of walking through the specifics; I'll go over them briefly.

First, you need to decide how and where you're running Gitian itself. If you're running a Debian/Ubuntu distro you can set up a secondary user on your computer to 'host' Gitian builds. You can also set up a virtual machine (via Docker or VirtualBox) to host Gitian builds within. I'm sort of lazy, so I went the route of setting up a secondary user on my machine to run Gitian in.

Once you have Gitian configured, you can start the Bitcoin build process. Mostly, this involves cloning the bitcoin gitian.sigs repository (where it'll create the output asserts files for you) and then running gitian-build.py, a script included in bitcoin/bitcoin's contrib directory.

gitian-build.py acts as a wrapper around gitian-builder that organizes the build process into a few general tasks: setup, build, verify, and sign. Once you have Gitian set up, you can run the verified build process with the following command, where username is your github handle, and 0.XX.0 is the source tag that you'd like to build and verify.

./gitian-build.py --detach-sign --no-commit -b username 0.XX.0

The Bitcoin gitian-build.py script has a couple of extra options; for instance, using -B instead of -b will also build signed versions of the Windows and OSX binaries. To be honest, I'm not sure what the difference is between a signed and non-codesigned binary, but there are options for both!

You may notice that my PR for verifying the 0.17.0 Bitcoin binaries is missing the OSX binaries; in order to compile the OSX binaries (particularly on a non-OSX device) you need to install extra packages.

There are a few more pieces of Bitcoin's Gitian setup that I should mention. Both the gitian-builder and gitian-build.py scripts are wrappers for ordering and running the actual build scripts. So where are the actual build scripts for bitcoin? They're defined as a set of YAML files, in bitcoin/contrib/gitian-descriptors. There's one for each of the build targets (linux, windows-nonsigned, windows-signed, mac-nonsigned, mac-signed). The gitian-build.py file is hardcoded to load these YAML files into gitian-builder, depending on what options you've passed it and your current system setup (i.e. do you have the OSX dependencies downloaded?).

Now that we've got a pretty good idea of the setup that Bitcoin has established for the build, let's talk a bit about how Gitian itself works. Gitian will spin up a container (defaulting to KVM, but it can also be configured to use LXC), download the listed dependencies into it (these are specified in the YAML file), and then run the build script (also outlined in the YAML file).

After the build has completed, Gitian's gverify script is run against the built binaries, which outputs an assert file. These are the files that you sign and upload to Bitcoin's gitian.sigs repo, signalling that you also have independently verified the Bitcoin binary!

Notice that if you've set up Gitian to run inside a VM on your machine, the build itself will take place inside yet another container, one spun up by Gitian itself. It's a bit of a build turducken.

There are a number of reasons for this. The container that Gitian spins up has its time set to a known value, so that all builds use the same time. It also uses the same container architecture, which hopefully means that file system irregularities are largely eradicated. Having every user build on the same architecture removes a good deal of possible variability from the build process.

Comparing Bitcoin and Tor

To be honest, I didn't spend a lot of time digging into the Tor build process, but a few things about it stuck out. First, they use something called rbm, which is based on runc, another container-based solution. They used to have a Gitian-driven process, but moved away from it - their original Technical Details blog post on reproducible builds mentions Gitian, but their link to the build setup instructions leads to a git commit about how it's been deprecated. I didn't do a deep dive into how rbm or runc works, but this comment in their docs leads me to believe that I'm not missing much:

We have written a pair of blog posts that describe in more detail why this is important, and the technical details behind how this previously got achieved when using the Gitian system, if you are curious. The new build system based on rbm is working similarly and is facing pretty much the same issues.

The new rbm-based process isn't well documented, but it appears to be invoked when you run make - in this way it's much easier than the Bitcoin Gitian build process, as you run the rbm process every time you build the project.

Finally, Tor doesn't have a set of signed, verified assert files; rather, you can individually build and check the shas yourself. It feels incredibly Bitcoinic to have a publicly available set of signatures that verify the build, whereas with Tor you have to do the verification independently, if you care.

Turtles

There's a problem with Gitian, however, in that you're largely depending on the binaries of the packages you download from Debian being uncompromised. If, for some reason, the gcc compiler binary that you've downloaded from the Debian repo is compromised, all of the software that you compile with it will also be compromised. Gitian trades off relative speed for some amount of trust in the Debian packages.

There's another project[2] currently in the works that would replace Gitian. Instead of downloading a VM to run a build in, you'd build an entire runtime from scratch, first by downloading the source for a compiler, then using it to compile another version of gcc, and so on. What were packaged binaries downloaded into a VM under Gitian are now compiled from source on your machine. The project's still a work in progress, but if successful, it would decrease the amount of systemic trust required for build verification, while potentially increasing the time required by a great deal, at least on the first run.

If you're curious, you can check out turtles here.

[1] Bitcoin (money) or the Tor project (anonymity) are two examples that currently have a public reproducible build process. Debian (computing platform) is currently in the beginnings of an enormous effort to create a reproducible build process for all of the packages that they publish.

[2] Thanks Carl for clueing me in to the work you've been doing on this!

#bitcoin #tor #reproducible-builds #turtles