HiDPI support coming to Flash Player

The current state of Flash content on the new MacBook Pro Retina (HiDPI) display is certainly a little underwhelming compared to HTML content in Safari. I thought I’d share that we are actively working to support the new displays. What exactly that means in the end is still a work in progress. But to illustrate, here are a couple of comparison screenshots which show what we are working on:

[Screenshot: a Flex application rendered at Retina resolution]

The second screenshot shows YouTube (other major video sites like Vimeo work great too):

[Screenshot: YouTube playback at Retina resolution]

What can you expect for your content? Here is the current state of affairs:

- Usually no changes to your SWFs are required. If you use vector graphics they will be transparently scaled up.
- Bitmap filters scale up transparently as do most other effects if your display list object is on the main display list.
- On the ActionScript side, Flash pixel units remain logical units, i.e. an ActionScript pixel (or point, in OS X terms) is actually 2×2 screen pixels on a Retina display. This is similar to how OS X approaches it with CoreGraphics and CoreAnimation.
- You will be able to access a new property: Stage.contentsScaleFactor. This property tells you if the Stage is on a HiDPI display or not. You could for instance use this property to load higher resolution bitmaps if you’d like.
- Stage3D will work slightly differently. As a Retina display quadruples the number of pixels, rendering at full resolution could mean a huge performance loss. And of course your code might be highly dependent on actual back buffer sizes (for instance if you render to texture, do pixel effects and such). So the Flash Player will not change the back buffer size by default. To enable HiDPI displays we will add a new flag to Context3D.configureBackBuffer (the exact API is not fleshed out yet) which will allow you to opt into a higher resolution back buffer. Most 2D frameworks based on Stage3D will probably want to do this. Again, this approach is similar to what is done on OS X; see the documentation around setWantsBestResolutionOpenGLSurface.
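
The new Stage.contentsScaleFactor property mentioned above could be used like this (a sketch only: the property is not final, and the hasOwnProperty guard is my own defensive assumption for players that predate it):

```actionscript
// Pick 2x bitmaps when running on a HiDPI display.
var scale:Number = 1;
if (stage.hasOwnProperty("contentsScaleFactor")) {
    scale = Number(stage["contentsScaleFactor"]);
}
var loader:Loader = new Loader();
loader.load(new URLRequest(scale > 1 ? "logo@2x.png" : "logo.png"));
```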

Of course we expect a few issues here and there but from what we have tested so far most sites work great in this new mode, including big franchises like FarmVille and Angry Birds.

Flash Player 11.2 graphical gem

At MAX2011 I had a session talking about Flash Player futures and what we have in store to keep the needle moving. A lot of things have made it in, including a small graphical gem I’d like to reveal today. Its primary focus is not so much the web but those few cases where designers use the Flash Player for offline graphics production work, which remains surprisingly common.

Flash Player has seen improvements in the graphics department over the last few years, but we have not really addressed basic vector rendering that much. That’s rather visible when you look around the web today. A lot of graphic assets are created using Flash Professional and then either exported as bitmaps to be used in other workflows or, of course, as SWFs. The rendering of vector graphics in the Flash runtime has a distinct pattern due to the way anti-aliasing is computed.

Until today there was a ceiling on the quality you could accomplish unless you jumped through some hoops. These limitations in the Flash Player rasterization engine exist because the target was, and is, real-time rendering of vector graphics. The runtime still does this pretty well, even today with all the GPU and HTML5 Canvas tag brouhaha.

Anyhow, we can do much better quality-wise if you do not care about performance that much. So why not offer the option to crank up the quality level? Flash Player 11.2 has a couple of new stage quality settings which try to help. The specific improvements for these modes can be split into these categories:

- better anti-aliasing using 16×16 oversampling per pixel instead of the classic 4×4.
- linear blending for anti-aliased pixels to avoid halos when certain colors are used. It also visibly improves the pixel stepping artifacts of thin curves and lines.
- dithered gradient rendering to avoid banding with gradients which have small luma changes.
- better interpolation between gradients stops to avoid banding.

It’s fairly simple to enable these new modes in Flash Player 11.2: set your stage quality in AS3 or AS2 to "16X16" or "16X16LINEAR". Yes, it works for both AS2 and AS3 movies. In AS2 simply add this action to your SWF:

_root._quality = "16X16LINEAR"; // or "16X16"

In AS3 add this to your actions:

stage.quality = "16X16LINEAR"; // or "16X16"

You might want to export to SWF version 15 or newer, as some internal limits on bitmap sizes have been lifted in that version; this matters for this feature if your stage is larger than 1024×1024, because the 16×16 oversampling requires all internal coordinates to be scaled 4×.

Here is a matrix which shows the current rendering on the left column and the two new modes on middle and right column:

Now the question for you will be whether you will really see a difference with your content. That’s a good question. If you can instantly spot the differences in the above image, you are on a good track to realize improvements in your own designs. If not, rethink using this feature, since the impact on performance is really severe.

On that note: these modes are not designed for real-time use cases. Please do not blindly enable these modes for web content; that is not the point. These modes are VERY slow and designed to improve quality, not performance. The only use case where it would make sense to use them on the web would be, for instance, to pre-render sprite sheets for use with Stage3D. We also have a new feature lined up in the next version of the Flash Player which will extend BitmapData.draw() to take an extra parameter specifying the quality, without having to change the global stage quality. We know that this is an important feature for games which want to run in LOW quality but render assets in HIGH quality.
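
Until that extra parameter ships, pre-rendering a sprite sheet works by changing the global stage quality around the draw call; a sketch ('art' is a hypothetical MovieClip holding the vector asset):

```actionscript
// Pre-render a vector clip at maximum quality, then restore the
// real-time-friendly setting. 'art' is a hypothetical MovieClip.
var sheet:BitmapData = new BitmapData(Math.ceil(art.width), Math.ceil(art.height), true, 0);
var previousQuality:String = stage.quality;
stage.quality = "16X16LINEAR"; // or "16X16"
sheet.draw(art);               // rasterizes at the current stage quality
stage.quality = previousQuality;
```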

Technical note: While 16X16LINEAR oversamples fractionally covered pixels in linear sRGB space, the linear-space calculations do not apply to general compositing operations. I did experiment with doing the entire rasterization in linear sRGB space, but it turns out to be more of a liability than a help for existing content: anything with transparency changes so visibly in appearance that designers would have to make adjustments. Designers already make manual adjustments to handle non-linear color space artifacts in their designs, so limiting the linear-space calculations to the parts the designer can’t control, like anti-aliasing in this case, makes more sense and generally keeps the intent of the design while improving the rest of the image. Going forward it would of course be great to operate only in linear sRGB space and offer tooling around this. Workflow-wise this gets complicated quickly though, because as soon as you exchange data with anything other than Flash, including other Adobe tools and any sort of common file format, you end up with standard non-linear sRGB. You’ll see part of this problem if you use the new 16X16LINEAR mode and export the result to a PNG or another format you want to reuse: if there is any sort of transparency, weird results can ensue. In that case you would want to stick with 16X16.

Why Starling (or any other 2D framework on top of Stage3D)?

Let me try to answer a pressing question from the community: Why did Adobe not accelerate the classic display list APIs to support the GPU instead of inventing a new API called Starling?

Well, we have done that (accelerating the classic display list APIs that is) and we have learned a lot from it. In fact we did it twice with two completely different approaches:

Approach #1: Back in Flash Player 10 (early 2008) we introduced ‘wmode=gpu’, which accelerated compositing of display list objects using the GPU. It did so by pre-rendering display list objects on the CPU and uploading them as textures to the GPU, where they were then composited. It worked in some cases, but in the end we discovered that only a handful of sites were using this mode, as no one could figure out how to create faster content with it. Worse, in some cases it looked like the mode had been enabled by accident, as the site ran much faster in non-GPU mode. Designing content for GPUs is non-obvious, as I will outline below. For these reasons, and because GPU code is generally very expensive for Adobe to maintain, we decided to pull that rendering mode from Flash Player 10.3 and let it fall back to ‘wmode=direct’.

Approach #2: On mobile, which includes Android and iOS, we have ‘renderMode=gpu’. Unlike ‘wmode=gpu’ on the desktop, this mode renders vector graphics directly on the GPU using OpenGL ES 2. It is still available to you today on Android and iOS and we see some content using it. Content which uses ‘renderMode=gpu’ successfully sticks to a very small subset of the classic display list APIs, a subset which looks eerily close to what Starling provides. And yet the overall cost in the Flash Player is higher than if you just used Starling, due to the many layers involved in emulating some classic display list features. In short: you are likely better off using Starling going forward for new content.

So what is the problem with using the classic display list APIs? The essence is that they were designed for a software renderer. They do not easily scale to being pushed to a GPU for faster rendering.

- The classic display list has many legacy features which are tied to the specific way our software rasterizer works. That includes vector masks and scale-9, for instance. You will see that with Starling you have to find a different way to get the same effects.

- A lot of other classic display list features cannot be easily expressed on a GPU without going through slow and complex code paths and, more importantly, a loss of fidelity. That includes blend modes, filters, some forms of transformations and device text, among many others. In some of those cases we have to fall back to software, which makes creating well-performing SWF content difficult, to say the least. You need to understand exactly what happens under the hood of the Flash Player to get well-performing content. Documenting the exact behavior of the Flash Player without access to the actual Flash Player code is very difficult, as there are simply too many special cases; that documentation would end up being nothing short of the actual Flash Player code. And reading a large C++ code base might not be your thing either. ;-)

- GPUs like flat display hierarchies. Deeply nested MovieClips are a big no-no. You might think this could be easily optimized behind the scenes; I can tell you that without hints about the original application data structure layout this is not possible. It’s the classic problem where each additional abstract API layer in an application introduces more entropy, and in the end you are unable to figure out the original intent of the application, which you need in order to apply meaningful optimizations. I see too much content where excessive use of nested MovieClips makes it impossible to figure out what the content is actually doing on the screen.

Let me put this into an analogy: let’s say the Flash Player had no APIs to draw strings or text, only APIs to draw individual characters, and drawing strings was implemented by some AS3 code. OK, fine; but suppose that for the internal Flash Player code, drawing individual characters is 10× slower than drawing complete strings. The Flash Player would then have to reverse-guess what the string/text was, which is expensive and sometimes not possible.

- GPUs like bitmaps. Rendering vectors either has to be done on the CPU, which means you incur texture upload costs for each frame, or it creates a lot of vertex data, which is a problem on mobile GPUs (and Intel desktop GPUs ;-) ). Rendering gradients has its own challenges, as pre-rendering a radial gradient into a bitmap can be faster than using pixel shader code on most GPUs. This seems counter-intuitive but makes sense once you realize that texture fetches are implemented in a dedicated part of the silicon, whereas a pixel shader has to run in the ALUs.

- Mouse events are implemented with perfect hit testing in the classic display list API, i.e. hit testing is based on the actual vector graphics shapes. If you have a circular vector shape as a button, a mouse click will not activate that button unless it is within that circle. This makes sense on the desktop, where you have a precise mouse cursor, but is extremely wasteful on mobile, where you really want to deal with simple, large, rectangular touch areas. Each additional computation cycle spent detecting mouse hits increases the perceived lag of a SWF. What’s worse, if you want a touch area which extends beyond the graphical representation of the button, you have to add another MovieClip with a transparent vector rectangle to the display list, which further impacts overall performance.

- The classic display list API is a giant state machine which needs to be evaluated for every frame. Just translating an object in x/y can trigger expensive recalculations and re-rendering without you knowing it. The classic example here is cacheAsBitmap, which is probably the most misunderstood and misused feature in the Flash runtime. With Starling the state changes from frame to frame are not hidden but plainly visible in ActionScript, which means you have a chance to see what is actually going on.
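
The transparent-rectangle workaround for touch areas described above looks roughly like this in classic display list code ('button' is a hypothetical circular vector button, registered at its top-left corner):

```actionscript
// Extend the touch target of a circular vector button beyond its
// visible shape, at the cost of one extra display object.
var touchArea:Sprite = new Sprite();
touchArea.graphics.beginFill(0x000000, 0);   // fully transparent fill
touchArea.graphics.drawRect(-20, -20, button.width + 40, button.height + 40);
touchArea.visible = false;                   // hit areas need not be visible
button.addChild(touchArea);
button.hitArea = touchArea;                  // hit testing now uses the rectangle
```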

I could go on and on, but I hope this answers some questions of why we are offering Starling.

Long term I hope that most games and multimedia content will move to Stage3D and use the classic display list for what it’s really good at: creating high fidelity vector graphics on the fly, rendering text, pixel processing and many other things. It certainly won’t go away and we will continue to add features and optimize performance. If you have fixed graphics assets, it is usually better to bring them in externally as bitmaps and stick with Stage3D.

I strongly believe that with Stage3D and Starling we are way ahead of other web technologies, which still have to go through the same learning experience we went through over the last 4 or so years.

LZMA

Yup, the current Adobe Flash Player Incubator builds have more stuff to talk about. We are constantly looking at keeping the Flash runtime up to date with current technologies. And as I said in my previous post, bandwidth and storage space still matter for most developers and users. So going forward we are looking at extending the current compression mechanism for SWF files, which is based on zlib today, to also support LZMA compression. We have been using LZMA for several years now in our installers. And as with JPEG XR, LZMA became a requirement for Molehill textures. The step to support it for SWFs was a small one from there.

LZMA was originally created by Igor Pavlov, who has since placed the entire code base into the public domain. It beats zlib compression by a wide margin. Though it is much better overall, the incremental improvement very much depends on the type of content in your SWF file. In general, the more ActionScript code and vector graphics (and now 3D meshes) you have in your SWF, the bigger the improvement will be. It will not do much better if your SWFs mainly consist of bitmaps. Let me put out some numbers here:

                                   zlib               LZMA              Gain/Loss
Mandreel NyxQuest                  14,296,699 bytes   8,892,786 bytes   -37.8%
Flex 4.5 RSLs (combined)           3,091,229 bytes    2,357,002 bytes   -23.8%
CopperCube Molehill Backyard Demo  5,530,215 bytes    4,890,509 bytes   -11.6%
Away3D LoaderOBJTest               3,921,994 bytes    3,754,181 bytes   -4.3%
Mario Forever Flash                2,779,142 bytes    2,735,930 bytes   -1.6%

What about ByteArray.compress()/ByteArray.uncompress() support for LZMA? Not yet, that is a little more complicated.

So how can you try this out? Well, this is so fresh that we don’t have tools available yet, though you could whip up a Python script which does this in about 10 lines of code. Since I am not a Python buff, I whipped up a few lines of C++ code to do this instead :-) Feel free to try it out if you are adventurous and let me know if it makes a difference for you.
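
Roughly, such a script could look like this. A sketch only: the function name is mine, and the header layout ('ZWS' signature, version, uncompressed length, compressed length, then 5 LZMA properties bytes and the payload) is my reading of the format, which I have not verified against the player:

```python
import lzma
import struct
import zlib

def swf_to_lzma(data: bytes) -> bytes:
    """Repack an existing SWF as an LZMA-compressed (ZWS) SWF.

    Assumed layout: 'ZWS' + version + uncompressed length +
    compressed length, then 5 LZMA properties bytes and the payload.
    """
    sig, version = data[:3], data[3]
    file_length = struct.unpack('<I', data[4:8])[0]  # uncompressed size incl. header
    if sig == b'FWS':      # uncompressed input
        body = data[8:]
    elif sig == b'CWS':    # zlib-compressed input: inflate first
        body = zlib.decompress(data[8:])
    else:
        raise ValueError('not a SWF file')
    c = lzma.compress(body, format=lzma.FORMAT_ALONE)
    # FORMAT_ALONE output = 5 properties bytes + 8-byte size field + payload;
    # the SWF header wants only the 5 properties bytes, so strip the size field.
    props, payload = c[:5], c[13:]
    return (b'ZWS' + bytes([max(version, 13)]) +
            struct.pack('<I', file_length) +
            struct.pack('<I', len(payload)) +
            props + payload)
```

Since the input can be a zlib-compressed (CWS) SWF as well, existing files can be repacked directly.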

And as a reminder: all the features which are part of Incubator are experimental and not committed to any final release; we are really only looking for early customer feedback.

[Update: Looks like Burak Kalayci and Joa Ebert already found a bug in my sample code. This is of course the reason we have Incubator builds: to catch these things early. I'll be updating this code when the next Incubator build goes out. Keep the feedback coming!]

JPEG XR

The Flash Player Incubator builds on labs.adobe.com contain a capability which is the result of some of the Molehill work I have done. When we looked at how to support textures for Molehill I immediately faced the issue of having to deal with inadequate image formats.

Traditional JPEGs are simply not flexible enough. They lack transparency support and suffer from low quality: most encoders use 4:2:0 chroma sampling, and the most commonly used implementation is based on outdated reference source code which does not leverage modern compression techniques. PNGs have their own share of issues; specifically, they tend to be rather bad for photographic and noisy content.

JPEG XR to the rescue. JPEG XR is a completely free specification, provided by Microsoft under the Microsoft Open Specification Promise. It has been standardized as ITU-T T.832 and comes with fairly good reference source code. Why is JPEG XR so cool?

  • It supports lossy compression of alpha channels.
  • It supports true lossless compression when needed.
  • It has better quality than JPEG in most cases.
  • In computation and complexity it is much lighter weight than JPEG 2000.
  • It supports floating point formats.
  • It only supports one color space, the only one which matters on the web: sRGB (in video terms that is BT.709).
  • It supports RGB555 and RGB565 color models which is important for block compressed Molehill textures.
  • It’s completely open as mentioned above.
  • etc.

The chroma sampling issues with most available JPEG encoders in particular are a pretty big problem for me, and certainly for many designers. Let me show you what I mean: here is an image compressed with 4:2:0 chroma sampling on the left and one with 4:4:4 on the right; notice the washed-out red. (Did you know that in Fireworks 4:4:4 chroma sampling kicks in only around 86 in the JPEG compression quality settings?)
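
To see why 4:2:0 washes out isolated saturated colors, you can simulate the subsampling on a single 2×2 block (a sketch using the full-range BT.601 conversion as used by JPEG; real encoders differ in details):

```python
# Simulate 4:2:0 chroma subsampling on a 2x2 block: one pure-red pixel
# surrounded by white pixels.

def rgb_to_ycbcr(r, g, b):
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return tuple(int(round(min(255, max(0, v)))) for v in (r, g, b))

block = [(255, 0, 0), (255, 255, 255), (255, 255, 255), (255, 255, 255)]
ycc = [rgb_to_ycbcr(*p) for p in block]

# 4:2:0 keeps per-pixel luma but averages chroma over the whole 2x2 block.
cb = sum(p[1] for p in ycc) / 4
cr = sum(p[2] for p in ycc) / 4
resampled = [ycbcr_to_rgb(y, cb, cr) for (y, _, _) in ycc]

print(resampled[0])  # the red pixel comes back washed out, around (121, 57, 57)
```

With 4:4:4 the chroma is kept per pixel and the red survives the round trip; with 4:2:0 the single red pixel inherits mostly-white chroma and turns into a desaturated brownish red.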

So what does that mean for the Flash Player? Though I added JPEG XR primarily for Molehill texture support, you can load any JPEG XR image just like you load PNG or JPEG files in the Flash Player today (if you mark your SWF as version 13).

IE9 also has native JPEG XR support. Here is a nice comparison of the quality improvements you can get with JPEG XR compared to JPEG.

Bandwidth is still important, and reducing the size of assets is particularly key if you do something like games which rely on huge amounts of them. Does that mean that PNGs and JPEGs are obsolete? Not by any means, especially in the case of PNG. PNG does extremely well on synthetic/flat content and will beat both JPEG and JPEG XR every time with that type of content.

2D Sprite Sheets with Molehill

Thibault posted a 2D sprite demo with his announcement. How was this done? Well, it’s rather simple: here is the code for it.

I essentially implemented a new class which works fairly similarly to a Sprite object but called it Sprite3D instead. It supports scale, rotation, alpha and color transforms. All the basic stuff you need. Of course you could easily extend this to support even more things like effects. Anyhow, hopefully looking at this code will give you an idea of how to do 2D graphics with Molehill. It’s much simpler than doing 3D for sure, especially the shaders are much easier to understand.

The sprite sheet itself was created with a combination of tools: SWFSheet and TexturePacker. Both of these are free. Within TexturePacker I exported to the cocos2d plist sprite sheet format which is parsed and managed in the Sprite3D class.

Oh, in case you are wondering: that ‘thedog.atf’ file uses a new file format we are introducing specifically for texture data. We will make the encoding tools for this available very soon, at which point I will explain more about how it works.

Here is what you should see when you run this demonstration:

Molehill from scratch with Adobe Flash CS 5

Flash Player builds with Molehill are now available for everyone to download. Many will want to start playing with this right away and many will probably ask: “What kind of crazy API is this? How do I get started? I don’t get it.”

There are two paths you can take: Either you get one of the available frameworks for Molehill like Away3D or you can take the deep dive and start from scratch. Starting from scratch can be a challenge since the APIs look intimidating at first. Let me give you a head start and provide you a Hello World type Adobe Flash CS 5 project using a famous 3D object: the teapot. From there it should be easier to experiment and explore.

To make it even easier I have added everything you need to get started with Adobe Flash CS 5 into the .zip file. Get the archive here. The most important part is that you need to copy two files into your Adobe CS 5 installation; follow the instructions in the archive.

Once you are all set up and compile the sample code you should end up with this:

Have fun experimenting! Don’t hesitate to ask questions.

H.264 hardware decoding in Mac OS X

ArsTechnica recently asked Adobe if we would use the recently added video acceleration API in Mac OS X 10.6.3. The answer was yes, and today we are making it available as a beta version (it was also a feature included in the release candidate builds we have been making available on labs.adobe.com since April 7th 2010, though not turned on; more details at the end).

Background

The primary reason this API exists is because we have been working with Apple to come up with a way to reduce power consumption on Macs. As we see more and more HD content on the web it is critical that we take advantage of hardware resources when available. In that context it is important to understand that this API targets HD content, not SD or smaller sized video. In fact SD sized content will not be accelerated in most cases. The decision of what content is accelerated and on which machine it is supported is up to Apple.

Machine and OS support

The new video acceleration API is only available in Mac OS X 10.6.3 or later and is limited to GPU models such as the NVIDIA GeForce 9400M, GeForce 320M or GeForce GT 330M. For more details you can look at Apple’s technote. Here is a list of the Mac models currently supported:

  • MacBooks shipped after January 21st, 2009
  • Mac Minis shipped after March 3rd, 2009
  • MacBook Pros shipped after October 14th, 2008
  • iMacs which shipped after the first quarter of 2009

(Mac Pros are not supported as of today)

How do I know that it is working?

After you install the new beta version of Adobe Flash Player (code named “Gala”) and play a video you will sometimes notice a white rectangle overlaying the video. This is the sign that hardware decoding is currently active.

If the white rectangle is missing, Adobe Flash Player has reverted to software decoding. We will of course remove the white rectangle for the final release.

What are the limitations right now?

  • Some resolutions are not supported. Specifically, YouTube sometimes provides a resolution of 864 × 480 pixels for its 480p content, which forces a software fallback.
  • Resolutions smaller than 480 × 320 pixels are not accelerated on NVIDIA GeForce 9400M based Macs. On the NVIDIA GeForce 320M and GeForce GT 330M the threshold can be a bit higher. These thresholds are picked by Apple and balance power usage of the CPU vs. the GPU for their particular hardware. Remember that using the GPU for video decoding does not always result in overall power savings; this can only be decided based on the exact hardware combination and the content you are trying to play. Playing video has a fixed baseline cost on the GPU, for instance, which is not the case when you decode on the CPU.
  • The software decoder in Adobe Flash Player is more forgiving when it comes to improperly encoded video files; it works around many issues. The hardware decoder cannot handle some of these cases. You might notice that some videos have ‘jumpy’ frames, i.e. frames play out of order (we have seen that with some files uploaded to YouTube). This is usually because Composition Time Offsets are not properly set up.
  • The hardware decoder is limited to 2 instances at a time, and this limit is system-wide. If you have more than 2 videos open at a time, the 3rd one will fall back to software decoding. This is the case even when a video is on a hidden tab (another reason that hardware decoding is reserved for high resolutions).
  • In the current release of Mac OS X 10.6.3, hardware accelerated decoding will sometimes stop functioning until you restart Safari. We are in the process of resolving this issue with Apple, but if you can reproduce it consistently with a specific URL, please let us know.

Safari and Performance

Compared to QuickTime-based video playback in Safari 4.0.x on Mac OS X 10.6.3 (or your standalone VLC/QuickTime player, that is) there is still room for improvement in Flash Player. We have a good plan for how to proceed, which will allow us to leverage all the hardware resources available to us.

Video playback is generally hardware accelerated on two levels: 1. decoding the H.264 bit stream itself, and 2. displaying and scaling the decoded YUV12-formatted video frames. The new API provided by Apple only covers H.264 decoding, and we are well aware that we need to accelerate the display and scaling of video; CAOpenGLLayer is the vehicle for that. We are looking at how we can get this implemented soon, but it’s simply too late to include it in Flash Player 10.1.

Previous release candidates

As some have noticed, previous release candidates we have made available on labs.adobe.com referenced this hardware decoding API provided by Apple. We are not in a position yet to enable this by default (hence the extra beta version we are making available) as this has only seen very limited testing by the engineers. Because of some of the issues I mentioned above, we want to put the hardware acceleration functionality through a full public beta cycle before including it in a final shipping version of Flash Player.

If you decide to install this beta version please let us know if you encounter any issues and file bugs here.

Press any key to continue

The recently released Flash Player 10.1 rc contains a couple of enhancements which are worthy of a quick note.

Screen savers and video playback

An annoying behavior in older Flash Player versions is that passively consumed content, video specifically, does not prevent the screen saver from kicking in. After some internal conversation we think we finally have a good answer for how to solve this problem. If all of the following conditions are true, the screen saver is prevented from kicking in even if you are not in full screen mode:

1. Video is playing
2. Video is not paused or stopped
3. Video is not buffering
4. Sound is playing
5. Sound has a volume (this makes sure that silent ads do not cause harm)
6. The SWF is currently visible (with some caveats on platforms and browsers, see next paragraph)

So no more tapping the keyboard while you are watching a video!

Determining the visibility of SWFs

In my previous post I mentioned that we now throttle the player whenever a SWF is not visible. Well, I wish we could make this work consistently. As of today we cannot always determine whether our instance is visible; there is no standard way of doing this, and every browser works slightly differently.

Here is the current status:

Browser                 Hidden tab detected?   Scrolled out of view detected?
IE 7/8 (Win)            YES                    YES
Firefox 3.6 (Win)       YES                    NO
Opera 10.1 (Win)        NO                     NO
Safari 4.0.5 (Mac)      YES                    YES
WebKit nightly (Mac)    YES                    YES
Firefox 3.6 (Mac)       NO                     NO
Firefox 3.7 (Mac)       NO                     YES
Firefox 3.x (Linux)     NO                     NO

Each time you see a NO, we cannot throttle SWFs to use fewer CPU resources. These limitations are dictated by the browsers, so it will take some time to sort this out.

If you have the Flash Player 10.1 rc installed, here is something simple to try out: go to this page (skip the ad): http://www.nickjr.com/kids-games/ants-adventure.html. Now either put the page on a hidden tab or scroll to the bottom of the page so the SWFs are not visible. In IE on Windows and the WebKit nightly on the Mac you should see that the CPU usage drops significantly after a couple of seconds because we throttle the SWFs.

I believe this problem can be easily fixed in Firefox on the Mac and most of the other browsers going forward. On Linux, however, this will be much more tricky because of GTK (the framework we have to use). We will probably need a special NPAPI extension and lots of browser changes to make this possible.

Timing it right

Status quo

During the Flash Player 10.1 time frame I was tasked with taking a look at the timing system we use in the Flash Player. Until now the Flash Player has been using a poll based system: everything which happens in the player is served from a single thread and entry point, using a periodic timer which polls the runtime. In pseudo code, the top level function in the Flash Player looked like this:

while ( sleep( 1000/120 milliseconds ) ) {
    // Every browser provides a different timer interval
    ...
    if ( timerPending ) { // AS2 Intervals, AS3 Timers
        handleTimers();
    }
    if ( localConnectionPending ) {
        handleLocalConnection();
    }
    if ( videoFrameDue ) {
        decodeVideoFrame();
    }
    if ( audioBufferEmpty ) {
        refillAudioBuffer();
    }
    if ( nextSWFFrameDue ) {
        parseSWFFrame();
        if ( actionScriptInSWFFrame ) {
            executeActionScript();
        }
    }
    if ( needsToUpdateScreen ) {
        updateScreen();
    }
    ...
}

The periodic timer is not driven by the Flash Player, it is driven by the browser. In the case of Internet Explorer there is an API for this purpose; in the case of Safari on OS X it is hard coded to 50 frames/sec. Every browser implements this slightly differently, and things become very complex quickly once you go into details. This has been causing a lot of frustration among designers, who could never count on consistent cross platform behavior.

Another challenge with this approach is that limiting the periodic timer to the SWF frame rate is not acceptable. The problem becomes obvious when you think of a SWF with a frame rate of, let’s say, 8 which plays back a video running at 30 frames/sec. To get good video playback you really need to drive the periodic timer at a much higher frequency, otherwise video frames will appear late. In the end the Flash Player always used the highest frequency available on a particular platform and/or browser environment.

The wrong path

The obvious way to re-architect this is to get rid of the polling and instead design an event based system. The new player code would have looked like this, with different subclasses of an Event base class encapsulating what the polling code had done before:

Event e;
while ( e = waitForNextEvent() ) {
    e.dispatch();
}

This approach failed miserably:

  • CPU usage turned out to be much higher than expected due to the abstraction involved.
  • In some cases the queue would grow unbounded.
  • The queue needed a prioritization scheme which turned out to be almost impossible to tune properly.
  • Most SWF content out there depends on a certain sequencing logic. Out-of-order events broke the majority of SWFs out there.

It’s not all bad

Back to the drawing board. This time my focus was on the actual problem: the Flash Player polls up to 120 times a second even if nothing is happening. Modifying the original code slightly, I came up with this:


while ( sleepUntil( nextEventTime ) OR externalEventOccurred() ) {
    ...
    if ( timerPending ) { // AS2 Intervals, AS3 Timers
        handleTimers();
        nextEventTime = nextTimerTime();
    }
    if ( localConnectionPending ) {
        handleLocalConnection();
        nextEventTime = min( nextEventTime, nextLocalConnectionTime() );
    }
    if ( videoFrameDue ) {
        decodeVideoFrame();
        nextEventTime = min( nextEventTime, nextVideoFrameTime() );
    }
    if ( audioBufferEmpty ) {
        refillAudioBuffer();
        nextEventTime = min( nextEventTime, nextAudioRebufferTime() );
    }
    if ( nextSWFFrameDue ) {
        parseSWFFrame();
        if ( actionScriptInSWFFrame ) {
            executeActionScript();
        }
        nextEventTime = min( nextEventTime, nextFrameTime() );
    }
    if ( needsToUpdateScreen ) {
        updateScreen();
    }
    ...
}

This approach is solving several problems:

  • There is no abstraction overhead.
  • In most cases it reduces the polling frequency to a fraction.
  • It is fairly backwards compatible.

More importantly, I replaced the browser timer with a cross platform timer which can wait until a particular time. This not only yields better cross platform behavior, it also allows us to tune the system in a way I could not before. Which leads me to the most important change you will see in Flash Player 10.1: the way we behave when a SWF is not visible.

Implications for user experience

In Flash Player 10.1, SWFs on hidden tabs are limited resource-wise. Whereas they would run at full speed in Flash Player 10.0 and before (note though that we NEVER rendered; we only continued to run ActionScript, audio decoding and video decoding), we now throttle the Flash Player when a SWF instance is not visible. Making this change was not easy, as I had to add many exceptions to avoid breaking old content. Here is a list of some of the new rules:

Visible:

  • SWF frame rates are limited and aligned to jiffies, i.e. 60 frames a second. (Note that Flash Player 10.1 Beta 3 still has an upper limit of 120, which will be changed before the final release.)
  • timers (AS2 Interval and AS3 Timers) are limited and aligned to jiffies.
  • local connections are limited and aligned to jiffies. That means a full round trip from one SWF to another takes at least 33 milliseconds (two jiffies). Some reports we get say it can be up to 40 ms.
  • video is NOT aligned to jiffies and can play at any frame rate. This increases video playback fidelity.

Invisible:

  • SWF frame rate is clocked down to 2 frames/sec. No rendering occurs unless the SWF becomes visible again.
  • timers (AS2 Interval and AS3 Timers) are clocked down to 2 a second.
  • local connections are clocked down to 2 a second.
  • video is decoded (not rendered or displayed) using idle CPU time only.
  • For backwards compatibility reasons we override the 2 frames/sec frame rate to 8 frames/sec when audio is playing.

This marks a pretty dramatic change from previous Flash Player releases. It’s one of those changes which are painful for designers and developers but unavoidable for a better user experience. Let me show you a CPU usage comparison with content running on a hidden tab (the test URL was this CPU intensive SWF):

Flash Player 10.0

Flash Player 10.1

In this test case the frame rate in the background tab has been reduced to 8 frames/sec because audio effects are playing. If there were no audio, the improvement would be even more pronounced. The test machine was an Acer Aspire Revo AR1600.

PS: You’ll notice in the two screenshots that the memory usage also shows a quite dramatic difference. That’s for another blog post.