Categories
Uncategorized

Needs More Crypto

Various problems that could be solved by the application of cryptography

Phone Spam and Scams

The problem of phone spam is not knowing that the caller is who they say they are. We have already solved this problem on the web: it’s the same problem as knowing that you’re getting your bank’s website and not some hacker’s. Usually when web security is explained to lay people, the focus is on the fact that traffic is encrypted to prevent eavesdropping. Of course this is important, since web traffic hops across many untrusted routers and servers between its origin and destination. But it is just as crucial to know that the destination is who you want it to be. Encryption alone only guarantees that you are being hacked by no more than one hacker.

Phone calls need to be initiated over a protocol that validates cryptographic certificates to confirm that the other party is who they say they are.

This would solve both the problem of spammers bugging regular people and the problem of scammers impersonating customers to businesses. I had an experience the other day where I called my credit card’s customer service and, as soon as I was connected to a person, the call dropped. This happened twice. I suspect they thought I was a scammer because at the time I was logged into my account on my work computer. My work computer is on a VPN which I’ve heard sometimes routes our traffic through India, probably because we have a team in Bangalore.

If my phone had a trusted certificate, my bank could have more confidence that it was me calling.
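
As a rough illustration of the check proposed above, here is a minimal challenge-response sketch in Java. It is not an existing telephony protocol: the class and method names are made up for this example, and it assumes the callee has already obtained the caller’s public key from a certificate it trusts (certificate chain validation is left out).

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.security.Signature;

// Hypothetical sketch, not an existing telephony protocol.
class CallerAuth {
    // Stand-in for a certificate: in practice the public key would arrive in a
    // certificate whose chain the callee has already validated.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("EC");
            generator.initialize(256);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Callee: pick a fresh random challenge so an old answer cannot be replayed.
    static byte[] newChallenge() {
        byte[] challenge = new byte[32];
        new SecureRandom().nextBytes(challenge);
        return challenge;
    }

    // Caller: sign the challenge with the private key behind the certificate.
    static byte[] answer(PrivateKey callerKey, byte[] challenge) {
        try {
            Signature signer = Signature.getInstance("SHA256withECDSA");
            signer.initSign(callerKey);
            signer.update(challenge);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Callee: accept the call only if the answer verifies under the claimed key.
    static boolean verify(PublicKey claimedKey, byte[] challenge, byte[] answer) {
        try {
            Signature verifier = Signature.getInstance("SHA256withECDSA");
            verifier.initVerify(claimedKey);
            verifier.update(challenge);
            return verifier.verify(answer);
        } catch (GeneralSecurityException e) {
            return false; // a malformed answer counts as a failed check
        }
    }
}

class CallerAuthDemo {
    public static void main(String[] args) {
        KeyPair customer = CallerAuth.newKeyPair();
        byte[] challenge = CallerAuth.newChallenge();
        byte[] response = CallerAuth.answer(customer.getPrivate(), challenge);
        System.out.println(CallerAuth.verify(customer.getPublic(), challenge, response));
    }
}
```

A scammer who only knows my phone number cannot produce a valid answer, because the answer depends on a private key that never leaves my phone.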

Transferring Medical Data

I had a bothersome time recently getting new glasses. I didn’t want to buy into the Luxottica cartel, so I used a popular startup. To get my prescription they offered to call my eye doctor themselves. The eye doctor tried to fax it over, but that failed for whatever reason. I’m not sure why I can’t be trusted to relay my prescription to the eyeglass store, but let’s assume there’s a good reason. My prescription could have been sent using PGP-encrypted email. And I wouldn’t have had to ask my eye doctor to send the prescription to the store. They could send it to me, along with a PGP signature made with a key whose public half is published on the popular key-servers. This would prove that the prescription wasn’t altered by me or any other intermediary. I’m not sure how this works with HIPAA compliance, but I don’t know of a good reason this wouldn’t work. There’s just the bad reason that the secure email market is dominated by non-interoperable proprietary solutions.

Resumes

Study: Job Applicants With 4-Year College Degree Just As Successful As Those Who Lie About Having 4-Year College Degree

The Onion

I don’t know how real a problem this is, applicants lying about degrees or experience on their resumes, but I can imagine a cryptographic solution: a digital resume format which contains, for each employer, a cryptographic signature over that employer’s portion of your resume. Today, when checking a candidate’s references, a prospective employer can call up previous employers to verify dates of employment. But we can cut out the manual steps. This can be done the same way certificate-based digital signatures work with PDFs, with the only difference being that instead of one signature validating the whole document, we’d have multiple signatures, each validating only a portion of the document. The infrastructure costs could be minimal if public keys are hosted on the traditional networks of key-servers used with PGP.
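
To make the per-section idea concrete, here is a small Java sketch in which every resume entry carries its own signature. The format and all names (`ResumeEntry`, `SignedResume`) are invented for illustration, and the employer’s public key is simply passed along with the entry, standing in for a key-server lookup.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;
import java.util.List;

// Hypothetical format: one resume entry plus the signature of the employer it names.
class ResumeEntry {
    final String text;            // e.g. employer, title, and dates
    final PublicKey employerKey;  // assumed to come from a key-server lookup
    final byte[] signature;

    ResumeEntry(String text, PublicKey employerKey, byte[] signature) {
        this.text = text;
        this.employerKey = employerKey;
        this.signature = signature;
    }
}

class SignedResume {
    static KeyPair newEmployerKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("EC");
            generator.initialize(256);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Employer side: sign only the entry that concerns this employer.
    static ResumeEntry sign(String text, KeyPair employer) {
        try {
            Signature signer = Signature.getInstance("SHA256withECDSA");
            signer.initSign(employer.getPrivate());
            signer.update(text.getBytes(StandardCharsets.UTF_8));
            return new ResumeEntry(text, employer.getPublic(), signer.sign());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recruiter side: check one entry against the key of the employer it names.
    static boolean isValid(ResumeEntry entry) {
        try {
            Signature verifier = Signature.getInstance("SHA256withECDSA");
            verifier.initVerify(entry.employerKey);
            verifier.update(entry.text.getBytes(StandardCharsets.UTF_8));
            return verifier.verify(entry.signature);
        } catch (GeneralSecurityException e) {
            return false; // a malformed signature counts as invalid
        }
    }

    // The resume as a whole checks out only if every entry does.
    static boolean allValid(List<ResumeEntry> resume) {
        return resume.stream().allMatch(SignedResume::isValid);
    }
}
```

If an applicant edits the text of an entry after it was signed, `isValid` returns false for that entry while the other entries still verify.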

Two new creational design patterns

Polluting Factory Method

I named this the polluting factory method pattern because it relies on mutation. The nifty thing about this pattern is that if you squint really hard, it looks like you have named parameters, and, especially if you use a multi-line lambda, it looks almost like an object literal.

What I like about this pattern is that there is a clear distinction between required parameters and optional ones. Also, by using the most succinct class syntax in Java, there is relatively little boilerplate.
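
The description above could be sketched roughly as follows. This is a guess at the shape of the pattern, not the post’s original code: `Sandwich`, its fields, and the factory name `of` are all made up for illustration.

```java
import java.util.function.Consumer;

// Hypothetical example class; names are illustrative only.
class Sandwich {
    // Required: supplied as an ordinary argument and never changed afterwards.
    final String bread;

    // Optional: plain fields with defaults that the caller's lambda may overwrite.
    String cheese = "none";
    boolean toasted = false;

    private Sandwich(String bread) {
        this.bread = bread;
    }

    // The factory method builds the object with defaults, then lets the lambda
    // mutate ("pollute") it before handing it back.
    static Sandwich of(String bread, Consumer<Sandwich> options) {
        Sandwich sandwich = new Sandwich(bread);
        options.accept(sandwich);
        return sandwich;
    }
}

class PollutingFactoryDemo {
    public static void main(String[] args) {
        // With a multi-line lambda the call site reads a bit like an object literal.
        Sandwich lunch = Sandwich.of("rye", s -> {
            s.cheese = "swiss";
            s.toasted = true;
        });
        System.out.println(lunch.bread + " / " + lunch.cheese + " / " + lunch.toasted);
    }
}
```

Required parameters are positional arguments of the factory method, so they cannot be forgotten, while anything assigned inside the lambda is optional by construction.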

Lockstep Builder Pattern

 

The heart of this pattern is having a new class for each required field and a single method on each builder class which returns the next step. Compared to the builder pattern, this goes in the opposite direction. It is very verbose. The point of this pattern is to make object creation as easy as possible. By having only one method on each class, your IDE will practically walk you through creating the object step by step. If you instantiate an object often enough, eventually it might be worthwhile to invest in designing the class this way. Or if you use some form of code generation, that might also tip the scales to make this worthwhile.
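
A minimal sketch of what this could look like for a class with three required fields. The `Address` class and its step classes are hypothetical names chosen for this example.

```java
// Hypothetical example class; names are illustrative only.
class Address {
    final String street;
    final String city;
    final String zip;

    private Address(String street, String city, String zip) {
        this.street = street;
        this.city = city;
        this.zip = zip;
    }

    // Entry point: the only thing you can do with it is supply the street.
    static StreetStep builder() {
        return new StreetStep();
    }

    // One class per required field, each with exactly one method
    // that returns the next step.
    static class StreetStep {
        CityStep street(String street) {
            return new CityStep(street);
        }
    }

    static class CityStep {
        private final String street;

        CityStep(String street) {
            this.street = street;
        }

        ZipStep city(String city) {
            return new ZipStep(street, city);
        }
    }

    static class ZipStep {
        private final String street;
        private final String city;

        ZipStep(String street, String city) {
            this.street = street;
            this.city = city;
        }

        // The last step returns the finished object instead of another builder.
        Address zip(String zip) {
            return new Address(street, city, zip);
        }
    }
}

class LockstepBuilderDemo {
    public static void main(String[] args) {
        Address home = Address.builder().street("1 Main St").city("Springfield").zip("12345");
        System.out.println(home.street + ", " + home.city + " " + home.zip);
    }
}
```

Because each step exposes a single method, skipping a required field or supplying the fields out of order does not compile.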

Straw Man Proposal: Every Regex Should Have Its Own Class

Regular expressions are commonly written very casually, on the fly, based on a few known examples. Regexes are densely packed with logic that is often as much a matter of personal style as of intentional decisions about what the regex should or shouldn’t match. Many choices are overlooked by the author and end up being made implicitly by the defaults of the platform executing the regex. Some examples include whether or not to match across lines, or whether to be greedy (if the author even knows what that means).

A regular expression is usually pure implementation (unless it has embedded comments, which I’ve yet to see in the wild). I have a rule of thumb that most code logic should have two parts: what and how. Any non-trivial piece of logic should be wrapped in a function or class so that the next person coming by doesn’t have to execute the logic in their head to know what it’s doing. They can assume that the code does what it says it does unless they have further reason to doubt it. You could say this is another way of talking about the Single Level of Abstraction Principle.

The most important reason to give every regex its own class is unit testing. Every regex should be accompanied by a set of examples of what it’s intended to match. Every regex represents bugs waiting to happen, so creating it with a set of unit tests from the start prevents regressions of the original test cases and encourages the accumulation of additional regression tests.

Unit testing is a great mental hack to get around happy-path bias. I think regexes are naturally prone to happy-path bias.
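
As a small illustration of the proposal, here is one regex wrapped in its own class together with the examples it is meant to accept and reject. `UsZipCode` is a hypothetical name, and the checks are written as a plain `main` method only to keep the sketch free of dependencies; in a real codebase they would live in whatever test framework the project uses.

```java
import java.util.regex.Pattern;

// Hypothetical example: the class name says what is matched; the regex says how.
final class UsZipCode {
    // Five digits, optionally followed by a hyphen and four more digits.
    private static final Pattern PATTERN = Pattern.compile("\\d{5}(-\\d{4})?");

    // matches() must consume the entire input, so a trailing newline or
    // surrounding text is rejected without explicit anchors.
    boolean matches(String candidate) {
        return PATTERN.matcher(candidate).matches();
    }
}

class UsZipCodeExamples {
    public static void main(String[] args) {
        UsZipCode zip = new UsZipCode();

        // The happy path.
        check(zip.matches("12345"), "plain five-digit code");
        check(zip.matches("12345-6789"), "ZIP+4 code");

        // The cases that are easy to forget.
        check(!zip.matches("1234"), "too short");
        check(!zip.matches("123456"), "too long");
        check(!zip.matches("12345-"), "dangling hyphen");
        check(!zip.matches("12345\n"), "trailing newline");
        check(!zip.matches("zip 12345"), "embedded in other text");

        System.out.println("all examples pass");
    }

    static void check(boolean ok, String label) {
        if (!ok) {
            throw new AssertionError(label);
        }
    }
}
```

Each rejected example records a decision that would otherwise live only in the author’s head, and any later change to the pattern has to keep all of them passing.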

Counter: Why not just a function?

Response: Not a bad point. I’m more confident in stating the proposal “Don’t use a regex directly”. In the programming cultures I had in mind, by which I mean those passionate about testing, static functions are frowned upon to the point that even if there’s not a good reason against one in a particular case, a true class is considered better style probably for consistency’s sake. In an FP codebase, I wouldn’t begrudge a regex wrapped in a function.

Counter: What about a checklist for writing regexes? To make sure you’ve considered subtleties like greediness.

Response: That makes sense in the imaginary world where code is written once and seldom changed. In the real world where code is a living document, tests ensure continued compliance.

If a wheel keeps getting reinvented, the most important thing is for everyone to share the test cases that drove them to reinvent the wheel again.

The Sorting Hat from Harry Potter is really a hash function.

How to install Perl 6 in Ubuntu

The old version of this post was stupid. Just use https://github.com/nxadm/rakudo-pkg

What is parsing?

This is not a pipe.

This is not the painting entitled “The Treachery of Images” by René Magritte.

This is an image of the painting “The Treachery of Images” by René Magritte.

123.45

This is not a number.

This is a piece of text containing numerals, symbols which have numeric values associated with them, each individually, and also together as a whole.

Parsing is the process of interpreting the representation of an idea to get at the idea itself.

Podcasts

I’ve been a big podcast listener for several years. Here’s roughly the current list of podcasts I subscribe to, organized by how vehemently I recommend them.

Everyone Must Listen To

These are so good, it’s not worth explaining why, just listen to:

I Recommend

  • Planet Money
  • Tim Harford
    • 50 Things That Made the Modern Economy
    • Pop-Up Ideas
  • Flash Forward
  • BBC Analysis
  • TED Radio Hour
  • EconTalk
  • Embedded
  • BBC World Service Documentaries
    • It’s downright humbling to realize how diverse the world is.
  • BBC Seriously…
    • This one gets extra credit for being so sonically interesting.
  • Seminars about Long Term Thinking – The Long Now Foundation

I also listen to

Which is a recommendation in itself, just less strongly than the above.

  • ProPublica
  • C-SPAN After Words
  • NPR Story of the Day
  • Codebreaker
  • Intelligence Squared
  • The Infinite Monkey Cage
  • Reply All
Honorable Mention

I don’t really listen to these, but that’s no fault of theirs. They are worth checking out.

  • Hardcore History with Dan Carlin
  • The Joe Rogan Experience
  • Song Exploder
  • Democracy Now!
    • These guys do great journalism. I’ve contributed to them. I just can’t spare an hour a day on the daily news cycle.
  • Death, Sex and Money

I love the CockroachDB logo

I know nothing about design, but this is a great logo. The two circular arcs that make up the body and antennae create a partial Venn diagram, referencing the set theory and relational algebra that form the theoretical foundation for this and any relational database. The shape on the back of the cockroach evokes a funnel, the universal symbol for filtering: a fundamental database operation.

Git freebase

I’ve considered both rebase- and merge-based workflows for my projects, and I’ve come up with an alternative I’d like to propose as an enhancement to git.

I propose a command that would behave according to this pseudocode:
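
One reading of the behavior described in this post, written as a small Java sketch rather than as a real git feature: try a rebase, and if it does not finish cleanly, abort it and merge instead. The `GitRunner` interface, the `FreebaseOutcome` enum, and the `Freebase` class are hypothetical names; the only git commands assumed are `git rebase <upstream>`, `git rebase --abort`, and `git merge <upstream>`.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical abstraction: runs "git <args>" and returns the process exit code.
interface GitRunner {
    int run(String... args);
}

enum FreebaseOutcome { REBASED, MERGED, MERGE_CONFLICT }

class Freebase {
    // Try the rebase first; if it does not finish cleanly, undo it and merge
    // instead, so that every conflict stays visible as a merge in history.
    static FreebaseOutcome freebase(GitRunner git, String upstream) {
        if (git.run("rebase", upstream) == 0) {
            return FreebaseOutcome.REBASED;
        }
        // Simplification: any non-zero exit from the rebase is treated as a conflict.
        git.run("rebase", "--abort");
        return git.run("merge", upstream) == 0
                ? FreebaseOutcome.MERGED          // the merge applied where the rebase could not
                : FreebaseOutcome.MERGE_CONFLICT; // left for the user to resolve and commit
    }

    // Runs the real git executable found on the PATH, in the current directory.
    static GitRunner realGit() {
        return args -> {
            List<String> command = new ArrayList<>();
            command.add("git");
            command.addAll(Arrays.asList(args));
            try {
                return new ProcessBuilder(command).inheritIO().start().waitFor();
            } catch (IOException | InterruptedException e) {
                throw new IllegalStateException(e);
            }
        };
    }
}
```

Calling `Freebase.freebase(Freebase.realGit(), "origin/main")` from inside a repository would run the sequence for real; passing a fake `GitRunner` makes the control flow easy to test without touching a repository.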

This has the following benefits:

  • It results in a clean history whenever possible
  • It highlights conflicts better than merging or rebasing

Traditional techniques in git are terrible at documenting conflicts. Conflicts are not easy to deal with. By their nature, they are encountered by only half of the people responsible for them. A prudent team should always review conflicts. In the best case, the conflict was preventable and the instigator needs to learn how to avoid creating conflicts going forward, e.g. by pulling more frequently, formatting frequently edited constants across multiple lines, or picking a random position when inserting new cases into frequently edited switch statements. In the typical case, both parties to a conflict should at least review the resolution.

A typical rebase completely hides conflicts, except when a user is diligent enough to document them in the commit message, although even in that case they will hardly pop out. It’s not even totally obvious where a rebase, successful or not, has happened. You have to notice that a commit has two different timestamps for when it was committed versus authored, and even then it might have been because it was cherry-picked.

A merge is almost as bad at documenting conflicts. gitk doesn’t show the changes introduced by a merge commit. This is bad news, because it allows totally new changes to be hidden in merge commits.

This technique serves to highlight conflicts in history. Any divergence+merge was a conflict. It sticks out like a sore thumb. And relative to a merge-only workflow, you still have an easy to follow, mostly linear history.

This strategy is also optimal in the rare but possible case in which a rebase encounters a conflict that a merge would not have. This happens when a conflicting change exists in an intermediate commit in one branch, but a subsequent commit leaves the tip of the branch in a state that doesn’t conflict. It should be clarified then, that a merge will happen anytime there is a rebasing conflict. It does not mean the conflict had to be resolved manually. In that case these merges will show up as sort of false-positives of truly bad conflicts, but I believe this is still the best that could be hoped for.

This could be implemented as an option to rebase. If it were to be implemented as a separate git command, or for those who would prefer to alias it, I propose the name git freebase, as it is similar to rebase but allows the user to be free of the fear of poorly resolved conflicts hidden in history.

Note: this author does not condone (nor condemn) the use of drugs.