Since 2017 I have been very busy with our new startup, Voicegain. We have built a DNN-based speech-to-text engine (ASR) and a whole platform around it.

The accuracy is pretty good: we are now roughly on par with Amazon's speech-to-text. Here is a blog post with our accuracy benchmark from just over two months ago. We will soon publish a new benchmark with even better results.

Also coming soon is support for Twilio TwiML <Connect><Stream> which we think will be a better speech-to-text alternative for Twilio than TwiML <Gather>. We have already gotten some feedback from Twilio users and they are really looking forward to finally having support for standard speech grammars (GRXML and JSGF) on Twilio.


Something to help your kids understand physics

Is a head-on crash of two cars driving 50mph each equivalent to a crash of a single car driving 100mph against a solid wall? The answer is easy if you think about the energy involved rather than about forces, action, and reaction.

These two clips from Mythbusters will make a physics lesson cool and memorable for kids, and the rest can just enjoy the carnage.
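The energy argument can be checked with a few lines of Python. This is only a sketch under simplifying assumptions that are mine, not from the clips: both cars have the same (made-up) mass, the wall is perfectly rigid, and all kinetic energy goes into deforming the cars.

```python
MPH_TO_MS = 0.44704  # miles per hour to metres per second


def kinetic_energy(mass_kg, speed_mph):
    """Kinetic energy in joules: E = 1/2 * m * v^2."""
    v = speed_mph * MPH_TO_MS
    return 0.5 * mass_kg * v ** 2


mass = 1500.0  # assumed car mass in kg, purely illustrative

# Head-on at 50mph each: by symmetry each car absorbs its own kinetic energy,
# exactly as if it had hit a rigid wall at 50mph.
head_on_per_car = kinetic_energy(mass, 50)
wall_at_50 = kinetic_energy(mass, 50)

# One car hitting the wall at 100mph has to absorb all of this by itself.
wall_at_100 = kinetic_energy(mass, 100)

print(f"energy per car, head-on at 50mph each: {head_on_per_car / 1000:.0f} kJ")
print(f"energy per car, wall at 100mph:        {wall_at_100 / 1000:.0f} kJ")
print(f"ratio: {wall_at_100 / head_on_per_car:.1f}")
```

Because energy grows with the square of speed, doubling the speed quadruples the energy each car has to absorb, so under these assumptions the two crashes are not equivalent.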

How-to Guide to Climate Data Reconstruction Methods

Iowahawk has a step-by-step tutorial on how to reconstruct unavailable past temperature data from available past proxy data using statistical models built from available recent data.

Real (blue) and modelled (red) temperature data.

All you need to redo the tutorial is:

1. A computer. Which I assume you already have, because you’re reading this.

2. The illustrative spreadsheet, available as an Open Office Calc document here, or as a Microsoft Excel file here. Total size is about 1MB.

3. A spreadsheet program. I highly encourage you to use Sun’s Open Office suite and its included Calc spreadsheet — it’s free, very user friendly and similar to Excel, and it’s what I used to create the enclosed analysis. You can download and install Open Office here. You can do all of the examples in Excel too, but you’ll also need to download an additional add-on (see 4 below)

4. A spreadsheet add-in or macro for principal components analysis. Open Office Calc has a nice one called OOo Statistics which can be downloaded and installed from here. This is the macro I used for the enclosed analysis. If you’re using Excel, you’ll have to find a similar Excel add-in or macro for principal components analysis. There are several commercial and free versions available.
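If you would rather see the idea in code than in a spreadsheet, here is a minimal sketch of the same kind of reconstruction. It is not Iowahawk's spreadsheet: the temperature series, the proxy sensitivities (`GAINS`), the noise level, and the 40-year calibration window are all made up for illustration, and only the first principal component is used.

```python
import math
import random

random.seed(42)  # fixed seed so the synthetic example is repeatable

YEARS = 100        # total span covered by the proxies
CALIBRATION = 40   # most recent years, where temperature was "measured"
GAINS = [1.0, 0.8, 1.2, 0.5]  # made-up sensitivity of each proxy to temperature

# Synthetic "true" temperature: slow trend plus a wiggle (purely illustrative).
temperature = [0.01 * t + 0.3 * math.sin(t / 8.0) for t in range(YEARS)]

# Each proxy is a noisy linear response to temperature.
proxies = [[g * temp + random.gauss(0.0, 0.05) for g in GAINS]
           for temp in temperature]


def center(rows):
    """Subtract the column mean from every column."""
    k = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(k)]
    return [[r[j] - means[j] for j in range(k)] for r in rows]


def first_principal_component(rows, iterations=100):
    """Dominant eigenvector of the covariance matrix, by power iteration."""
    n, k = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(k)]
           for i in range(k)]
    vec = [1.0] * k
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


# 1. Principal components analysis on the proxies (available for all years).
centered = center(proxies)
pc1 = first_principal_component(centered)
scores = [sum(v * w for v, w in zip(row, pc1)) for row in centered]

# 2. Build the statistical model on the recent, measured period only.
recent = range(YEARS - CALIBRATION, YEARS)
slope, intercept = fit_line([scores[t] for t in recent],
                            [temperature[t] for t in recent])

# 3. Apply the model to all years, including the "unavailable" past.
reconstructed = [slope * s + intercept for s in scores]

past = range(YEARS - CALIBRATION)
rmse = math.sqrt(sum((reconstructed[t] - temperature[t]) ** 2 for t in past)
                 / len(past))
print(f"RMSE over the reconstructed period: {rmse:.3f}")
```

With synthetic proxies that really do track temperature, the reconstruction is good almost by construction. The interesting questions start when the proxies are noisier or respond to things other than temperature.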

Bad Capacitor C316 on Philips DVP642

I have had a Philips DVP642 DVD player for well over 4 years. I bought it because it could play PAL as well as NTSC. Moreover, a simple unlock code entered via the remote made it region-free, which allowed me to watch DVDs from Europe.

Well, eventually it died, and would only blink the red power LED. Fortunately, a quick search for “DVP642 power LED blinking” revealed that it might be easily fixable just by replacing capacitor C316, which has a tendency to go bad in the DVP642. I opened the DVD player, found capacitor C316, and indeed it was bulged.

Capacitor C316

Bulge on top of capacitor C316

I did not have a spare 1000µF 10V capacitor at home, so I got a replacement from RadioShack. The closest they had was 1000µF 35V for $1.59. Still, it was small enough to fit in the place of the old one.

Replacement capacitor soldered in

Once the new capacitor was soldered in, I powered the DVD player and it worked perfectly fine.

UPDATE: It has been almost half a year and the player still works fine with the replacement capacitor.

DFW airport observation point — “Founders Plaza”

Today I went to the “Founders Plaza”, which is an observation point for the DFW airport. It has recently been reopened and is pretty nice. There are picnic tables, nice grass, and shade, so you can spend hours watching airplanes. They even have telescopes, which are free to use, and a loudspeaker that broadcasts control tower communications (live).

This Google map shows the location of the Founders Plaza (marked A) relative to the runways.

I was there for about 1 1/2 hours. Most of the airplanes landing were from American Airlines. But I was also lucky to see an AN-124 from Russia (one of the 10 owned by Volga-Dnepr Airlines) and a British Airways Boeing 777.

Right after the Russian AN-124 landed, one of the landing AA planes, which was already almost above the landing strip, had to abort its landing and take off again. It was weird to see it suddenly accelerate and retract the landing gear. Maybe the Russian plane was too slow and had not left the runway yet (I am guessing; I should have been listening to the control tower communications).

Russian AN-124 landing at DFW (9/28/08)

British Airways Boeing 777 landing at DFW (9/28/08)

UPDATE: Recently, I went to the observation point again. This time it was a bit late, just before sunset. The wind was from the north, so the planes were taking off in front of the observation point. I had a telephoto lens this time (300mm with a 1.4x converter, instead of 50mm), but it was a bit too dark to take good photos.

American Airlines jet taking off at sunset

Parked UPS jet

Dell i486 33MHz for only $18,000

I found this Dell ad in an old issue of Byte magazine from September 1990.

Dell i486 25/33MHz

The prices range from $6,399 to $10,799, which corresponds to about $10k to $18k in 2008 dollars. The great features available for that price in 1990 were 4MB of RAM, an 800×600 graphics card, an 80MB to 650MB hard disk, and a 5.25″ 1.2MB floppy drive.
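The conversion to 2008 dollars is just a ratio of consumer price indices. The sketch below uses approximate annual-average US CPI-U values that I am assuming here (about 130.7 for 1990 and about 215.3 for 2008); check them against the official tables before relying on the result.

```python
# Approximate annual-average US CPI-U index values.
# These are assumed figures for illustration; verify against official BLS data.
CPI_1990 = 130.7
CPI_2008 = 215.3


def to_2008_dollars(price_1990):
    """Scale a 1990 price by the ratio of the two price indices."""
    return price_1990 * CPI_2008 / CPI_1990


for price in (6399, 10799):
    print(f"${price:,} in 1990 is roughly ${to_2008_dollars(price):,.0f} in 2008")
```

Under these assumed index values the multiplier is a little under 1.65, which is consistent with the rounded $10k to $18k range.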

FPUs compared — 1994 and 10 years later

A few days ago I pulled my NeuroControl Workbench (software I wrote for my M.Sc. project) off some old dusty backup CDs. I found this screenshot of neural net training.

Training ANN on Pentium 100MHz (47640 conn/sec)

I was curious how fast the same training would be on my current computer, which is already a few years old and has an Athlon 64 San Diego 4000+. Here are the results. (BTW, I was surprised that I was able to run an old MS-DOS program with graphics output and overlay memory management without too much difficulty under Windows XP.)

Training ANN on Athlon 64 4000+ (9172000 conn/sec)

So the difference in speed is 47640 connections/second vs. 9172000 conn/sec — this is over 190 times faster.

The Pentium 100MHz, which as far as I remember is the CPU used for the old run, was released in March 1994. The Athlon 64 4000+ was released in October 2004. So we have a time difference of about 10 years and a speed difference of about 200 times. That is roughly 7.6 doublings in 120 months, so the speed of FPUs over that period doubled on average about every 16 months.

(The speed increase is certainly also due to larger on-CPU caches. The Pentium 100MHz has only 16KB of L1 cache, while the Athlon 64 4000+ has 128KB of L1 and 1MB of L2 cache. The whole NCWB application can fit into the L2 cache on the Athlon CPU.)
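The doubling-time estimate can be reproduced in a few lines. This sketch uses the two measured speeds and the release months quoted above; treating the release dates as the start and end of the interval is a simplification.

```python
import math

old_speed = 47640      # conn/sec on the Pentium 100MHz
new_speed = 9172000    # conn/sec on the Athlon 64 4000+

speedup = new_speed / old_speed

# March 1994 to October 2004, counted in months.
months = (2004 - 1994) * 12 + (10 - 3)

doublings = math.log2(speedup)
doubling_time = months / doublings

print(f"speedup:       {speedup:.1f}x")
print(f"interval:      {months} months")
print(f"doublings:     {doublings:.2f}")
print(f"doubling time: {doubling_time:.1f} months")
```

Using the exact figures instead of the rounded "10 years and 200 times" moves the result by about a month, so it stays in the same range as the rough estimate.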

BTW, I initially wrote the software on a computer with a Cyrix 387 FPU coprocessor. The ANN training speed was about 8000 conn/sec. This was already after I optimized most of the ANN computation by doing all multiplications in inline assembler. FPUs are, or at least the 387 was, very easy to program in assembler because of their stack architecture. Here is a sample of the inline assembler code that I used:

void Neurode::sumUpError(Neurode far *nextLayer)
{
  asm push   ds // save DS; it is reused below to address nextLayer
  asm fldz   // err := 0
             // the error is summed on the bottom of the FPU stack
  asm les    bx,this
  asm mov    cx,WORD PTR es:[bx].(Neurode)numOutgoing // iteration counter
  asm jcxz   finish // finish if count==0
  // calculate offset into nextLayer.ingoing array
  // corresponding to the connection of this node with next layer
  asm mov    ax,WORD PTR es:[bx].(Neurode)posInLayer
  asm mov    si,sizeConnection // multiply AX (posInLayer) with sizeConnection
  asm mul    si
  asm mov    si,ax // SI is index into connection array
  asm lds    di, DWORD PTR nextLayer // get pointer to next layer
dosum:// loop over outgoing
  asm fld    DWORD PTR ds:[di].(Neurode)err // load nextLayer[i].err
  asm les    bx, DWORD PTR ds:[di].(Neurode)ingoing // get pointer to nextLayer[i].ingoing
  asm fmul   DWORD PTR es:[bx+si].(Connection)weight // multiply by connecting weight
  asm faddp  ST(1),ST(0) // add err*weight to error sum, pop it
  asm add    di,sizeNeurode // increment index into nextLayer array
  asm loop   dosum // repeat loop if not finished
finish:// store the sum (0 if there are no outgoing connections)
  asm les    bx,this
  asm fstp   DWORD PTR es:[bx].(Neurode)err // store summed-up error in this->err
  asm pop    ds
}
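For readers who do not speak x86 assembler, here is roughly what that routine computes, as a Python sketch. The class layout is inferred from the field names in the listing above (`err`, `posInLayer`, `numOutgoing`, `ingoing`, `weight`); the real C++ classes have more members, and Python uses double precision where the original used 32-bit floats.

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    weight: float


@dataclass
class Neurode:
    pos_in_layer: int = 0    # index of this neurode within its own layer
    num_outgoing: int = 0    # number of neurodes in the next layer
    err: float = 0.0         # backpropagated error
    ingoing: list = field(default_factory=list)  # one Connection per source neurode

    def sum_up_error(self, next_layer):
        """Sum next_layer[i].err times the weight connecting this neurode to it."""
        total = 0.0
        for i in range(self.num_outgoing):
            target = next_layer[i]
            total += target.err * target.ingoing[self.pos_in_layer].weight
        self.err = total


# Usage: a hidden neurode at position 1 feeding two output neurodes.
out_a = Neurode(err=0.5, ingoing=[Connection(0.3), Connection(0.2)])
out_b = Neurode(err=-0.25, ingoing=[Connection(0.7), Connection(0.8)])
hidden = Neurode(pos_in_layer=1, num_outgoing=2)
hidden.sum_up_error([out_a, out_b])
print(hidden.err)  # 0.5 * 0.2 + (-0.25) * 0.8
```

The assembler version does the same multiply-and-accumulate, but keeps the running sum on the FPU stack and walks both arrays with precomputed offsets instead of indexing.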

My NeuroControl Workbench can output screenshots in PCX format. I was surprised that neither Photoshop Elements nor GIMP was able to read these files correctly. Finally, I found that IrfanView was able to open and convert these PCX files correctly.

My Publications

I have put all my publications (including PDFs) on the My Publications page. I hope this will make it easy for whoever might be interested in them to actually read them. Personally, I do not like to find some interesting paper, only to discover that I have to drive 30 miles across the Metroplex to get it at a library, or have to pay $20 to get it from some journal website.

Back on the web

I haven’t really had a website since my time at university, which, including my post-doc, ended almost 8 years ago. I guess it is time to establish some web presence again. I will start by making my university publications available as PDFs.

BTW, my first website was on Geocities, back when it started in 1995/96. That was when a Pentium 100MHz was a fast processor.