Mar 10 2012
 

By default, Google tracks every search result you click on. They do this surreptitiously: URLs in Google search results appear to go directly to the destination:

 

But, upon click, URLs in Google search results change to go to Google first!

 

Straight Google removes this tracking from Google URLs across all Google products. It is easy to install and needs no configuration, but you must install Greasemonkey first.

Mar 02 2012
 

Starting with OS X Lion, holding down a key brings up a menu of alternate characters rather than repeating the key. (This is a feature.) There are many tips on how to re-enable key repeat globally, but you can also control the behavior per application (thanks, Egor Ushakov). This is handy for e.g. IntelliJ or RubyMine, or any other app that provides Vim-style keyboard bindings. The magic commands are:

% defaults write com.jetbrains.intellij ApplePressAndHoldEnabled -bool false
% defaults write com.jetbrains.rubymine ApplePressAndHoldEnabled -bool false

But how do you figure out what the magic identifier for your application is? Simple: defaults domains will list them all:

defaults domains | gsed -e 's/, /\n/g' | grep jetbrains
com.jetbrains.intellij
com.jetbrains.intellij.ce
com.jetbrains.rubymine
jetbrains.communicator.core

Note that in order to munge the commas into newlines for grep, gsed was required because OS X default sed cannot (easily) insert newlines.
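
If GNU sed is not installed, the same split can be sketched in Python. This is only a minimal sketch: the helper name split_domains is made up for illustration, and the sample string stands in for real defaults domains output.

```python
def split_domains(raw, needle):
    """Split comma-separated `defaults domains` output and keep matching entries."""
    domains = [d.strip() for d in raw.split(",")]
    return [d for d in domains if needle in d]

# Sample input standing in for real `defaults domains` output (an assumption).
sample = ("com.apple.finder, com.jetbrains.intellij, "
          "com.jetbrains.rubymine, jetbrains.communicator.core")

for domain in split_domains(sample, "jetbrains"):
    print(domain)
```

On a real machine the raw string would come from running defaults domains, for example via subprocess.check_output.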

Feb 21 2012
 

After my previous adventures in slicing and dicing a huge XML file, I wanted a means to randomly select files. But first, the directory had so many entries it was unwieldy on my laptop. The Python script below divvies the files up into directories of up to 1000 files each. (Adaptable to other contexts via slight tweaking of the filename regex and subdir name generation.)

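#!/usr/bin/python
import os
import re

where = '.' # source directory

for f in os.listdir(where):
  # The digits after "_COMM-" decide which bucket of 1000 the file lands in.
  m = re.search(r'.*_COMM-([0-9]+)\.xml', f)
  if m:
    # Integer division (//) behaves the same on Python 2 and Python 3.
    subdir = os.path.join(where, "%03d" % (int(m.group(1)) // 1000))
    try:
      os.mkdir(subdir)
    except OSError:
      pass # typically: the subdirectory already exists
    os.rename(os.path.join(where, f), os.path.join(subdir, f))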
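
To make the bucket naming concrete, here is a small sketch of just the filename-to-subdirectory mapping. The function name bucket_for and the sample filenames are hypothetical; only the regex and the division by 1000 come from the script above.

```python
import re

def bucket_for(filename, size=1000):
    """Return the zero-padded subdirectory name, or None if the name does not match."""
    m = re.search(r'_COMM-([0-9]+)\.xml', filename)
    if m is None:
        return None
    return "%03d" % (int(m.group(1)) // size)

# Hypothetical filenames, for illustration only.
print(bucket_for("export_COMM-12345.xml"))  # 012
print(bucket_for("export_COMM-999.xml"))    # 000
print(bucket_for("README.txt"))             # None
```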

Now on to the random selection, again with Python:

#!/usr/bin/python
import os
import random
import re
import sys

if len(sys.argv) > 1:
  where = sys.argv[1]
else:
  where = '.' # source directory 


# Keep only the all-numeric subdirectories created by the script above.
subdirs = [d for d in os.listdir(where) if re.search('^[0-9]+$', d)]
subdir = os.path.join(where, random.choice(subdirs))
print(os.path.join(subdir, random.choice(os.listdir(subdir))))

A quick shell loop leverages the Python script to grab files and dump into a repository of test data. Works on ZSH, Bash, perhaps others:

for i in {1..250}; do cp "$(./pick_a_file.py sub_dir_with_files)" "/destination/dir/filename_prefix_$(printf "%03d" "$i").xml"; done

 

 

Feb 14 2012
 

There is a lot to be said about the intricacies of IMAP delete flags vs. actual expunging of deleted messages, and about the confusion caused when a message is merely flagged for deletion but the user expected it to be really gone. This post is not about that. Everyone agrees that once a message is expunged, it definitely should be gone. But sometimes expunged messages still display in Thunderbird!

I often observe this:

  1. Delete message on the way to work using K-9 on my phone.
  2. Arrive at work and message is gone from my Inbox in Mail.app.
  3. Come home, download new mail in Thunderbird and see an Inbox full of undead messages.

No amount of re-expunging and re-fetching mail helps. Grepping through the server-side Maildir shows the messages really are gone from the folders in which Thunderbird is still showing them.

It turns out the reason they are still displaying in Thunderbird is mundane client-side index corruption. To clean things up:

  1. Right-click on mailbox
  2. Choose Properties...
  3. Click Repair Folder
  4. Rejoice at tidy mailbox
Feb 08 2012
 

Often Array(arg) is used to wrap a value in an Array unless it already is one, but it is flawed. Note the last result when applied to a Hash:

> Array(42)
 => [42] 
> Array([1,2,3])
 => [1, 2, 3] 
> Array(nil)
 => [] 
> Array("foo")
 => ["foo"] 
> Array({"foo" => "bar", "biz" => "baz"})
 => [["foo", "bar"], ["biz", "baz"]]

What went wrong is that Array() calls the (now deprecated) to_a on its argument. Hash has a custom to_a implementation with different semantics: it returns an array of [key, value] pairs. Instead, do this:

class Array
  def self.wrap(args)
    return [] unless args
    args.is_a?(Array) ? args : [args]
  end
end

That yields the expected results, even for Hashes:

> Array.wrap(42)
 => [42] 
> Array.wrap([1,2,3])
 => [1, 2, 3] 
> Array.wrap(nil)
 => [] 
> Array.wrap("foo")
 => ["foo"] 
> Array.wrap({"foo" => "bar", "biz" => "baz"})
 => [{"foo"=>"bar", "biz"=>"baz"}]

Use of is_a? is deliberate; duck-typing in this situation ([:[], :each].all? { |m| args.respond_to? m }) yields surprises since e.g. String is Enumerable in Ruby 1.8 and would not get wrapped.

For further discussion see Ruby-forum thread “shortcut for x = [x] unless x.is_a?(Array)” and StackOverflow “Ruby: Object.to_a replacement”.

Feb 08 2012
 

Slicing up XML files is best done with an XML parser. (Regular expressions, csplit, etc. are too easily confused by arbitrary strings in CDATA sections.) xml_split (may be obtained with CPAN by installing XML::Twig) mostly does the trick. Given a file like:

<?xml version="1.0" encoding="UTF-8"?>
<foo:Root xmlns:foo="http://www.foo.bar/fnarf/foo">
  <foo:child>
    ...
  </foo:child>
  <foo:child>
    ...
  </foo:child>
</foo:Root>

…xml_split can create many files, each containing:

<?xml version="1.0" encoding="UTF-8"?>
<foo:child>
  ...
</foo:child>

However, this loses the namespace declaration and the enclosing root element. Luckily, a little sed magic can bring those back:

find . -name '*.xml' | xargs -n1 sed -e '1 a\
<foo:Root xmlns:foo="http://www.foo.bar/fnarf/foo">
' -e '$ a\
</foo:Root>
' -i ''

find lists all the files, xargs invokes sed on them one by one (-n1), and sed adds the opening tag with namespace declaration after the first line (1 a) and the closing tag after the last line ($ a), editing each file in place (-i '' is the BSD/OS X sed form; GNU sed takes -i without the empty argument). Now each file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<foo:Root xmlns:foo="http://www.foo.bar/fnarf/foo">
  <foo:child>
    ...
  </foo:child>
</foo:Root>
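
For systems where the sed incantation is awkward, the same re-wrapping can be sketched in Python. This is a rough equivalent rather than what the post actually used: the function name wrap_in_root is made up, and like the sed version it does not re-indent the child element.

```python
def wrap_in_root(text, open_tag, close_tag):
    """Insert open_tag after the first line (the XML declaration) and append close_tag."""
    lines = text.splitlines()
    wrapped = lines[:1] + [open_tag] + lines[1:] + [close_tag]
    return "\n".join(wrapped) + "\n"

# A fragment shaped like the xml_split output shown earlier.
fragment = ('<?xml version="1.0" encoding="UTF-8"?>\n'
            '<foo:child>\n'
            '  ...\n'
            '</foo:child>\n')

print(wrap_in_root(fragment,
                   '<foo:Root xmlns:foo="http://www.foo.bar/fnarf/foo">',
                   '</foo:Root>'))
```

To process real files, read each one, pass its contents through wrap_in_root, and write the result back.
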
Nov 27 2011
 

The battery in my APC Back-UPS BR 800 was worn out after years of service, so I bought a replacement from APC.com. However, apcupsd still reported zero runtime and erratic charge and load percentages. I did some manual recalibration attempts (charge fully, discharge completely using constant load). This got my estimated runtime from zero up to a few seconds, but the UPS was still not useful. A couple of seconds of power outage would lead to bogus critically low battery readings and trigger automated shutdown. (This despite the fact that it took me about half an hour to complete the run-down under a similar load.)

This evening I was preparing for another recalibration attempt by looking for a way to disable the beeping when power is disconnected. It turns out apctest can disable the alarm. In the process, I noticed apctest can also read and write the battery date. On a whim, I updated the date. And, magic: Merely changing the battery date fixed the reported runtime, charge, and load percentages!

Changing the battery date back to the original value did not bring the bogus readings back. Presumably the behavior is based on dead-reckoning of time elapsed since last battery change rather than any knowledge of what the current date actually is.

Nov 17 2011
 

[ERROR] error: File name too long

A common way to install Ubuntu is with an underlying ext4 file system and eCryptfs encrypted home directories. ext4, like many other file systems, has a maximum filename limit of 255 bytes. eCryptfs creates filenames much longer than the original. Compiled Scala classes tend to have long file names since anonymous classes end up in their own files. Therefore, when compiling Scala projects within eCryptfs on ext4, it is easy to get file name too long errors. 🙁
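
To see which class files are at risk, a quick scan for long file names can help. The sketch below is an illustration only: the function name long_names, the directory target/classes, and the threshold of 140 bytes are all assumptions, since the effective limit under eCryptfs is lower than ext4's 255 bytes but is not stated here.

```python
import os

def long_names(root, limit):
    """Yield paths under root whose file name exceeds limit bytes when UTF-8 encoded."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if len(name.encode("utf-8")) > limit:
                yield os.path.join(dirpath, name)

if __name__ == "__main__":
    # Both the directory and the threshold are placeholders; adjust to your setup.
    for path in long_names("target/classes", 140):
        print(path)
```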

Oct 04 2011
 

Ruby, Python, and many other dynamic languages have a so-called splat operator that lets you easily invoke a function by providing a list of argument values:

def f(x,y)
  x*y
end

> fArgs = [6,7.0]
=> [6, 7.0]

> f(*fArgs)
=> 42.0
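
For comparison, here is the same idea in Python, where a single * covers both fixed-arity and variadic functions. The names f, g, f_args, and g_args are chosen purely for illustration.

```python
def f(x, y):
    """Fixed-arity function: exactly two arguments."""
    return x * y

def g(*xs):
    """Variadic function: any number of arguments."""
    return sum(xs)

f_args = [6, 7.0]
g_args = [1, 2, 3, 4]

print(f(*f_args))  # 42.0
print(g(*g_args))  # 10
```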

Scala does not have a splat operator per se, but you can achieve the same effect without too much work. Sadly the syntax is different for fixed-arity and variadic functions.

Scala splat for variadic functions

For variadic functions there effectively is a splat operator. If you invoke a variadic function and append :_* to the argument, the compiler will perform the splat:

> def g(xs:Int*) = (0 /: xs) (_ + _)
g: (xs: Int*)Int

> val gArgs = List(1,2,3,4)
gArgs: List[Int] = List(1, 2, 3, 4)

> g(gArgs:_*)
res23: Int = 10

Scala splat for fixed-arity functions

> def f(x:Int, y:Double) = x * y
f: (x: Int, y: Double)Double

> val fArgs = (6, 7.0)
fArgs: (Int, Double) = (6,7.0)

> f _ tupled fArgs
res8: Double = 42.0

Magic! The first part, f _, is the syntax for a partially applied function in which none of the arguments have been specified. This works as a mechanism to get a hold of the function object. tupled returns a new function of arity 1 that takes a single arity-n tuple. Here it is called as a method on the function object; the Scala Function object also provides an equivalent Function.tupled.

However, given a List of arguments to pass to f, I’m not sure how to easily convert the List to a Tuple.

p.s. There’s a StackOverflow post about this called “scala tuple unpacking.”