Raf — a colleague — gave me the following problem:
Given a String (x) containing only characters a-z, write a function (f) that returns a base 10 integer, which converts the String as if it were a base 26 numeral. Function f is bijective.
Here are some example runs:
x | f(x) empty | 0 a | 1 b | 2 z | 26 aa | 27 az | 52 ba | 53 bz | 78 aaa | 703 aaz | 728 aza | 1353
Using your preferred programming language, implement function f.
My solution is in a comment on this page. What is yours?
scala:
def raf(s: String) = (s.toList :\ 0) (_ – ‘a’ + 1 + _ * 26)
Haskell:
f :: String -> Int
f = foldr comb 0 where
comb x n = (fromEnum x – start) + (n * 26)
start = fromEnum ‘a’ – 1
This is one case where the Scala version looks a little nicer, to my eyes.
PS. Damn WordPress ate my pre tags… oh well.
PPS. DOUBLE WHOOPS. Of course it should be a foldl:
f :: String -> Int
f = foldl comb 0 where
comb n x = (fromEnum x – start) + (n * 26)
start = fromEnum ‘a’ – 1
This is EXACTLY why I prefer /: and :\!
This isn’t really a base-26 numeral though right? There is no “zero”…
There is a 0. It is the empty String.
Ruby:
def f s
s.split(“”).inject(0) {|i,c| i*26+c[0]-?a+1 }
end
This assumes lowercase inputs only – if that’s not the case, add “.downcase” before “split”.
And I agree with Mostafa – it’s not what I’d normally consider base 26. To rephrase his statement: There is no zero _digit_. If it had been proper base 26, then “a” would’ve been 0 and “ba” would’ve been 26 etc. To handle “proper” base26 with the Ruby above, all you do is remove the “+1″.
Incidentally, this example demonstrates a pet peeve of mine with Ruby: It feels much more logical to me to me to treat a string as an array of unsigned integers representing the characters. As such, I really would’ve preferred to be able to do this:
def f s
s.inject(i) {|i,c| i*26+c-?a+1}
end
But inject (and each, collect etc.) on String yields the whole string, and each_byte yields signed integers. Blah.
It is base 26. There is a zero.
Given:
data Nat = Zero | Succ Natand:
data Nat' = One | Succ' Nat'then
NatandNat'are isomorphic. Therefore, substituting an isomorphism has no effect on whether or not a base 26 system is being used.J:
f=:26#.96-~3 u:]
In Python:
def f(x):
num = 0
for place, char in enumerate(string[::-1]):
num += (ord(char) – 96) * (26 ** place)
return num
Another haskell solution:
f :: Enum a => [a] -> Int
f = sum . map (uncurry (*) . \(x,y) -> (26^x,fromEnum y – 96)) . zip [0..] . reverse
Oops, should have been ‘x’, not ’string’ there.
C:
// I've not tested this, I think it's right...
size_t f(const char* buf) {
size_t x = 0;
for(int i = 26; *buf; i *= 26, ++buf)
x += (*buf - 'a') * i;
return x;
}
Tony,
You may be right, though the argumentative programmer in me still doesn’t really quite agree with that use (see below) – some googling shows me this is called bijective numeration. It’s the first time in 28 years of programming I’ve ever come across it, though, so forgive me for expecting to have a digit of value zero…
I have a deep rooted expectation of having a digit of value zero so that I can add a zero to the right hand of a number to multiply the number by the base, and I think you’d find that to be a very common expectation among programmers.
While it’s probably not wrong to call bijective base-n just base-n, I think this is a case of (obscure, to me anyway) mathematical terms being slightly at odds with how most people use the term in terms of programming, and it smells to me because it makes the common case more long winded, and seems destined to cause confusion (as indeed it did here)
Googling some more shows that for the most part people seem to have the same expectation as me when it comes to the meaning of the term “base-n”, _except_ for base-26 where there’s wild confusion and some pages just assume you’ll start with A=1 and others just assume you’ll start at A=0, not to mention the ones that just assume A=10 just as we tend to do for hex.
The A=1 version is seemingly because of spreadsheets (converting column labels to a numeric position matching the row numbers), which I guess makes sense. Almost the exact same question as yours have apparently come up on a huge number of sites because of this.
So I guess I got to learn about bijective numerals, though I suspect there’ll be a very long time before I’ll need that bit of knowledge…
Yet another Haskell solution:
import Data.Char (ord)
f = let s = ord ‘a’ – 1 in foldl (\a d -> 26 * a + ord d – s) 0
a little python one liner:
conv = lambda s: sum([((ord(digit)-ord('a')+1) * (26**place)) for (place, digit) in enumerate(s[::-1])])
Doing it in C++ is tricky since you can’t use std::for_each on its own to do this without creating a functor to keep track of your exponent. C++0x has lambda closures though. Fortunately there is Boost.Lambda right now…
C++ using Boost.Lambda:
size_t f(std::string s) {
size_t x = 0, exp = 26;
std::for_each(s.begin(), s.end(),(var(x) += (_1 -'a') * var(exp)), (var(exp) *= 26));
return x;
}
In C/C++:
int f(char *s)
{
return *s?*s-96+f(s+1)*26:0;
}
Should probably have used std::accumulate there heh
In Groovy:
def int f(String x) {
def result = 0
x.reverse().eachWithIndex { c, i -> result += (c.bytes[0]-96) * (26**i)}
return result
}
Perl:
sub f
{
$num = $num * 26 + ord() – 96 for split //, shift;
$num;
}
erlang
raf(S) -> lists:foldl(fun(D,A) -> (A*26)+D-$a+1 end, 0, S).
In Java:
int f(String s) {
final char[] chars = new StringBuffer(s).reverse().toString().toCharArray();
int res = 0;
for (int i = 0; i
Ouch, I hate the web sometimes, here’s another go – Java:
int f(String s) {
final char[] chars = new StringBuffer(s).reverse().toString().toCharArray();
int res = 0;
for (int i = 0; i
Yet another python one-liner:
def f(s):
return sum([(ord(c) - ord('a') + 1) * 26**i for i, c in enumerate(s[::-1])])
Several responses in nearly as many languages, but none that seem particularly readable at first glance (not to mention particularly safe to use in some cases). Fairly typical of any of these simple programming problems then — they seem to bring out the hacker in us.
Perl, not the nicest solution, but what came to mind first:
sub f { my ($r, $i); $r += $_*26**$i++for map { ord() – ord(“a”) } reverse split//, shift; $r }
And add the missing +1 in the map.
Perl, readable and input validating version:
sub f
{
my $alfa = (shift or ”);
my $num = 0;
if ($alfa and not $alfa =~ /^[a-z]+$/) {
die “Parameter error”;
}
for my $a (split //, $alfa) {
$num = $num * 26 + ord($a) – 96
}
return $num;
}
Another Python solution:
def f(x):
return reduce(lambda x,y: 26*x + ord(y)-ord(‘a’)+1, x, 0)
PHP:
function f($x)
{
return intval(array_reduce(array_filter(str_split($x)), create_function(‘$v,$w’, ‘return $v * 26 + ord($w) – 96;’)));
}
Well, it is certainly neither elegant not recursive, but it does avoid having to reverse the String like I’ve seen in other examples.
Here is how I would have done it in Java:
public int f(String x) {
if (x.length() == 0) {
return 0;
}
int value = 0;
int multiplier = 1;
for (int i = x.length() – 1; i >= 0; i–) {
value += (x.charAt(i) – ‘a’ + 1) * multiplier;
multiplier *= 26;
}
return value;
}
Scala:
def f(x:String):Int = {
def fx(s:List[Char], p:Int): Int = s match {
case Nil => 0
case x::xs => (x – 96) * Math.pow(26,p).toInt + fx(xs, p+1)
}
fx(x.reverse.toList,0)
}
Too long, now that I see other solutions!
Hey, trying to grok Walter Chang’s solution I find that it doesn’t work. For values like “a”, “aa”, “aaa” it does.
I got it, he only forgot to reverse the string!
Sorry Walter
Walter’s solution doesn’t require the call to
toListsince :\ is defined on Scala’sString.Another java-solution, restricted to longs, without Jeffs 0-guard:
public long raf (String s26)
{
long res = 0L;
for (int i = 0; i
btw.: a hackers-blog without indentation is annoying, imho.
No – I don’t have suggestions where to get a better one.
Another java-solution, restricted to longs, without Jeffs 0-guard:
public long raf (String s26)
{
long res = 0L;
for (int i = 0; i < s26.length (); ++i)
{
char c = s26.charAt (i);
int j = c – ‘a’ + 1;
res *= 26;
res += j;
}
return res;
}
btw.: a hackers-blog without indentation and without code-masks is annoying, imho.
No – I don’t have suggestions where to get a better one.
“No – I don’t have suggestions where to get a better one.”
Write one!
in Python:
reduce(lambda x, y: x * 26 + (ord(y) – 96), “aza”, 0)
or a bit longer:
reduce(lambda x, y: x * 26 + y, map(lambda x: ord(x) – 96, “aza”))
or in Haskell:
foldl (\x y -> x * 26 + Char.ord(y) – 96) 0 “aza”
There is no zero digit.
It’s not base 26. Otherwise, “a” would be zero and “z” would be 25.
Vidar Hokstad was right.
I probably should elaborate. Consider base-10.
The empty string is not a number.
0 is 0.
1 is 1.
2 is 2 and so forth.
00 is 0.
000 is 0.
etc etc.
Now in base-26.
The empty string is not a number.
a is 0.
b is 1.
etc
aa is 0.
aaa is 0.
This is not to say that there is no challenge in your puzzle. It’s just not base-26.
@Stefan: I like your solution. I wish I had thought of it!
We can post formatted code using <code> tags and &nbsp; I think.
(http://codex.wordpress.org/Writing_Code_in_Your_Posts)
Test:
public long raf (String s26)
{
long res = 0L;
for (int i = 0; i
Oupse, looks like I was wrong! It was working in the preview at least
Dave,
When I said “the empty String is 0″, I had misinterpreted the question. Now that I understand the question, I explained why it is indeed a base 26 system being used.
Here is another explanation:
There are 26 data constructors for one type (a-z)
There are 10 data constructors for another type (0-9)
There is some bijective and total function between these two types.
This function (by implication) is converting between a base 26 and base 10 system — you can call the data constructors anything.
- QED
Ah yes. A bit of googling lead me to the phrase: modified base-k positional system also called a bijective numeration. This is the system you are describing….. And now I discover that this was mentioned above.
– Haskell, recursive
– will error if passed a string with a non-lowercase or non-alphabet char
import Data.Char(ord, isAsciiLower)
f :: [Char] -> Int
f [] = 0
f s | isAsciiLower (head s) =
(ord(head s) – ord(‘a’) + 1) * (26 ^ ((length s) – 1)) + (f (tail s))
As I am late, I give a bit more :
Scala :
def intToNBase(base : Int)(n : Int) : List[Int] = {
def inner(ntemp : Int, res : List[Int]) : List[Int] = {
if(ntemp == 0) res
else if (ntemp == 1) 1::res
else inner(ntemp/base, ntemp%base::res)
}
inner(n,List())
}
def intToBinary(n : Int) = intToNBase(2)(n)
def NtoIntBase(base: Int)(n : List[Int]) : Int = {
def inner(tempN : List[Int], res : Int) : Int = tempN match {
case Nil => res
case x::xs => inner(xs, x+base*res)//x +base*(inner(xs))
}
inner(n,0)
}
def f(x : String) = NtoIntBase(26)(x.toLowerCase.toList.map(_ – ‘a’+1))
Better late than never! (as they say)
In Clojure:
(defn f [s] (reduce #(+ (- (int %2) 96) (* %1 26)) 0 s))