Skype Journal

Independently covering the Talk Revolution since 2003

Tuesday, February 3, 2009

SILK: Skype's New Audio Codec Sets New Performance Standards for Voice Conversations

The most recent hotfix release of Skype for Windows 4 Beta 3 had one key new feature:

  • feature: Super Wideband audio codec

The associated Skype Garage post went on to say:

... Starting from this version we've included the new Super Wideband Audio codec. This is our second in-house built audio codec especially designed for calls over the internet with superb quality. The Super Wideband Audio codec will help you most on lousy network conditions and when you have lower bandwidth available, although it also improves quality in normal conditions too.
Today's Skype for Windows 4.0 Gold release allows the entire Skype for Windows user community to take advantage of the SILK codec's features.

SILK is a significant improvement on Skype's previously acclaimed HD Voice performance. I have now experienced a couple of calls where the SILK codec was available at both ends of the call; it certainly provides a clearer, crisper audio experience. (For those unfamiliar with the term, a "codec" is an algorithm that converts audio waveforms into a digital stream for transmission over the communications network and then converts that stream back into an audio waveform at the receiving end.)

Last week I had the opportunity to interview Jonathan Christensen, Skype's GM for Media Platform, to learn more about this "SILK" codec. The codec is the outcome of a three-year development process with a focus on:
  • improving the audio bandwidth out to 12 kHz (12,000 Hz)
  • providing bandwidth management to deal in real time with degraded network conditions
  • balancing the codec optimization between voice, music and background noise, each of which can have an impact on the overall user experience
  • overall robustness to provide a more consistent user experience, regardless of network conditions and an individual caller's voice signature.
While the human ear can hear sounds up to 22 kHz, the sound produced by the human vocal cords spans a frequency range of 20 Hz to 14 kHz; however, sounds below 70 Hz are not what you would call "pleasant" (as experienced with those "thump, thump" car speakers). Skype's SILK codec is optimized for the transmission of audio between 70 Hz and 12 kHz. Compare this to the 300 Hz to 3.4 kHz bandwidth of the PSTN's standard G.711 codec; wideband codecs such as AMR-WB and iSAC cover the range of 50 Hz to 7 kHz and 8 kHz respectively. And, as indicated in both the AMR-WB and iSAC Wikipedia entries, there is a major licensing cost consideration:

AMR-WB has been standardized by a mobile phone manufacturer consortium for future usage in networks such as UMTS. Although its speech quality (similar to Skype, including glitches) makes it likely that older networks will have to gradually be transformed to support wide band, its high legal costs may limit its uptake.
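
To put the audio bandwidth figures quoted above in perspective, here is a small Python sketch that applies the Nyquist criterion (a signal must be sampled at no less than twice its highest frequency) to each codec's upper band edge. The band edges are the ones cited in this post; the function and variable names are my own, and the result is a theoretical minimum, not the sample rate any of these codecs actually uses.

```python
# Upper band edges in Hz, as quoted in this post (not measured values).
UPPER_EDGE_HZ = {
    "G.711 (PSTN)": 3_400,
    "AMR-WB": 7_000,
    "iSAC": 8_000,
    "SILK": 12_000,
}


def min_sample_rate_hz(upper_edge_hz):
    """Nyquist criterion: sample at least twice the highest frequency."""
    return 2 * upper_edge_hz


def nyquist_table(edges=UPPER_EDGE_HZ):
    """Map each codec name to its theoretical minimum sample rate in Hz."""
    return {name: min_sample_rate_hz(edge) for name, edge in edges.items()}


if __name__ == "__main__":
    for name, rate in nyquist_table().items():
        print(f"{name:14s} needs at least {rate / 1000:.1f} kHz sampling")
```

By this measure, carrying audio out to 12 kHz means sampling at 24 kHz or more, roughly three and a half times the theoretical minimum for the PSTN band.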

However, in order to deliver on this audio bandwidth, Skype also had to consider how to get the voice stream across the Internet. SILK interacts with Skype's redeveloped (network) bandwidth manager, which uses a feedback algorithm to provide "adaptive bandwidth management". SILK is a "variable bitrate" codec that can scale the bitrate (the amount of data being transmitted as voice packets) up and down as necessary. The key network parameters governing this adaptation are packet loss and changes in jitter. For the end user, this means a level of call robustness that results in more consistent call quality, especially on lower speed Internet connections (below 3 Mbps), with no user intervention required.
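
Skype has not published the details of its bandwidth manager, so the following Python sketch is only a generic illustration of the feedback idea described above: raise the bitrate gently while the network looks healthy, and back off sharply when packet loss or jitter crosses a limit. The class name, thresholds, step sizes and bitrate bounds are all hypothetical values chosen for the example, not Skype's.

```python
from dataclasses import dataclass


@dataclass
class BitrateController:
    """Toy feedback controller; every default below is an illustrative assumption."""
    min_bps: int = 6_000          # hypothetical floor
    max_bps: int = 40_000         # hypothetical ceiling
    step_up_bps: int = 1_000      # additive increase when the network is healthy
    backoff_percent: int = 80     # multiplicative decrease when it is degraded
    loss_limit: float = 0.03      # packet loss fraction that triggers a backoff
    jitter_limit_ms: float = 30.0 # jitter that triggers a backoff
    bitrate_bps: int = 20_000     # starting bitrate

    def update(self, packet_loss, jitter_ms):
        """Feed in the latest network report and return the new target bitrate."""
        degraded = packet_loss > self.loss_limit or jitter_ms > self.jitter_limit_ms
        if degraded:
            target = self.bitrate_bps * self.backoff_percent // 100
        else:
            target = self.bitrate_bps + self.step_up_bps
        # Clamp the target to the allowed range.
        self.bitrate_bps = max(self.min_bps, min(self.max_bps, target))
        return self.bitrate_bps


if __name__ == "__main__":
    controller = BitrateController()
    reports = [(0.00, 5.0), (0.00, 8.0), (0.10, 12.0), (0.01, 45.0), (0.00, 6.0)]
    for loss, jitter in reports:
        print(loss, jitter, controller.update(loss, jitter))
```

The asymmetry (slow increase, fast decrease) is a common design choice for this kind of controller: it reacts quickly to congestion and probes for spare capacity cautiously.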

Another factor to be considered is that the perception of audio quality differs depending on whether the audio signal contains voice, music or random background noise. Suffice it to say that Skype's engineers have been involved in a balancing act amongst these factors in the development of the SILK codec.

The bottom line is that Skype has set a new bar for voice call quality and the associated user experience. Since SILK needs to be present at both ends of a call, the number of calls I have experienced with SILK has been limited but, as mentioned above, those I have made had a very crisp, clear audio quality. With today's launch of the Skype for Windows 4 Gold release, almost all my Skype-to-Skype calls will be able to achieve this performance level. Going forward, expect to see SILK incorporated into Skype for Mac in the near future. In addition, the SILK codec has been modularly designed for embedding into silicon; we can expect future Skype-enabled hardware platforms to be able to take advantage of SILK's performance.

And finally note that, in order to keep costs low while improving call quality, Skype has no licensing costs associated with their proprietary codec. Is there a potential for a new Skype revenue stream by licensing this codec to other communications service providers as well as hardware vendors?


5 Comments:

At February 16, 2009 4:28 AM , Blogger subhasis said...

How to detect it?

 
At February 16, 2009 4:29 AM , Blogger subhasis said...

I want to block Skype voice. How to detect this codec?

 
At March 4, 2009 7:00 AM , Anonymous Anonymous said...

@subhasis do you work for china?

 
At March 4, 2009 5:22 PM , Blogger DG said...

Any data on bit rate versus quality of experience using a formal measure? Or range of bit rates? Or even high level detail on the coding techniques being used (CELP or ? versus perceptual coding?)

 
At May 5, 2009 4:12 AM , Blogger endryha said...

Hi all! Where can I find more technical characteristics about SILK?

 
