An Interesting Gotcha

QUESTION: Do you have an interesting solution to a problem you can't shoehorn into a tip category?

ANSWER: Yes, as a matter of fact I do. I just ran into this one today. Here is how I described it to the IDL newsgroup community.

I just spent a pleasant hour or so chasing down an interesting WHERE function gotcha. I thought you might be interested.

I have an alphabetized string array:

   array = ['apple', 'avocado', 'banana', 'carrot']

I wish to make a list of those vegetables (I think of them as vegetables) that begin with the letter "a". I want this to be fast (there are several hundred entries in my real array), so I plan to search for byte values.

   index = WHERE( (Byte(array))[0,*] EQ Byte('a'), count) 
   Print, count
      1

Uh, huh. (Bit of head scratching here.)

I probably did the extraction incorrectly. Try this:

   veggie_letter = (Byte(array))[0,*]
   Print, Reform(veggie_letter)
      97  97  98  99
   letter = Byte('a')
   Print, letter
      97

Uh, huh. Let's see, "One, two 97s in there." Well, that's interesting. :-(

How about this:

   Help, veggie_letter, letter
      VEGGIE_LETTER   BYTE      = Array[4]
      LETTER          BYTE      = Array[1]

"LETTER, a byte *array*!? You don't suppose..." Try this:

   index = WHERE( (Byte(array))[0,*] EQ (Byte('a'))[0], count) 
   Print, count
      2

Hummm. V-e-r-y interesting...

Now I know how to fix the problem, but I don't know exactly what the problem is. (Although this is not so different from most computer problems, when you come to think of it.) Is the problem that the BYTE function always makes a byte *array* when converting string arguments, even a one-character string? Or is it that the WHERE function acts in a, uh, non-intuitive way when there are two vectors in a boolean expression?

And how does this WHERE expression work, anyway? Why don't I get errors? How can I exploit a boolean expression involving two vectors?

As usual, more questions than answers when you look deeper. Any ideas? :-)

As usual, Craig Markwardt replied. He had this to say.

You are being bitten by the "feature" that I love so much. Namely that in IDL, when you do X OPERATION Y, and X and Y are both arrays, then the expression is trimmed to the smaller of the two arrays.

So it's not anything special regarding WHERE, or boolean expressions, but rather that VEGGIE_LETTER EQ LETTER evaluates to a 1-element array. One element, because LETTER only has one element:

  IDL> help, veggie_letter EQ letter
  <Expression>    BYTE      = Array[1]
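
To see the truncation rule in isolation, here is a minimal sketch that can be typed at the IDL prompt. The variable names a, b, and s are made up for illustration; the byte values mirror the first letters from the example above.

```idl
; Two arrays of different lengths: the result is as long as the shorter one.
a = [97B, 97B, 98B, 99B]      ; first letters of the four strings
b = [97B]                     ; one-element array, like Byte('a')
Help, a EQ b                  ; a one-element byte array
Print, a EQ b                 ; only a[0] and b[0] are compared

; A scalar operand is compared against every element instead.
s = b[0]                      ; a scalar byte
Help, a EQ s                  ; a four-element byte array
Print, a EQ s                 ; 1 1 0 0
index = Where(a EQ s, count)
Print, count                  ; 2
```

The difference between b and s is why subscripting with [0] fixes the original expression: it turns the one-element array into a scalar.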

There are many times that I wish that IDL had an easier way to get the ASCII value of a character.

By the way, why not do this instead?

  index = where( strmid(array,0,1) EQ 'a')

This avoids the whole issue of converting to a different representation, and it just looks less gobbledygooky.
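
As a sketch of how that suggestion might be used in full, the lines below add the count argument and a guard for the no-match case. The guard and the message text are my additions, not part of Craig's reply.

```idl
array = ['apple', 'avocado', 'banana', 'carrot']

; StrMid returns the first character of every element; 'a' is a scalar,
; so the comparison covers the whole array.
index = Where(StrMid(array, 0, 1) EQ 'a', count)
Print, count                  ; 2

; Where returns -1 when nothing matches, so check count before subscripting.
IF count GT 0 THEN Print, array[index] ELSE Print, 'No matches.'
```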

Happy guacamole,

Craig

Of course, I would have done that, but it didn't make nearly as interesting a story. :-)
