Saturday, August 18, 2012

Dock appearing when running VMware Fusion in fullscreen

In OS X Lion, you can make the Dock appear in full screen apps by moving your mouse to the bottom of the screen and then moving it down against the bottom twice.  This can be annoying in VMware Fusion, as it is easy to do accidentally when trying to switch tasks in the guest's task bar.  Paste this into the Terminal to prevent this behaviour (it sets a 10 second delay before the Dock appears):


defaults write com.apple.Dock autohide-delay -float 10 && killall Dock

Thursday, August 9, 2012

How much memory does a C# string take up?

I've seen various answers on the web and they're mostly wrong or make wrong assumptions (this page for example assumes the overhead is 20 bytes based on only a couple of tests).  The answer is a bit more complex but actually makes a lot of sense when you investigate what happens in the .NET runtime.

Let's assume a 32bit system.

A C# string is a reference type.  Every reference type has an 8 byte header.  The first 4 bytes are used for the lockbits (to support the C# lock statement).  The second 4 bytes are a pointer to the object type.  The object type in turn contains the object vtable.  In reality, not all the bits in the header are needed for these two fields.  For example, the object type pointer is aligned to a 4 byte boundary, so its lower two bits are always zero and can be reused by the garbage collector for marking the object in its mark-and-sweep cycle.  That's 8 bytes minimum just to have an empty object (System.Object is 8 bytes).

X = 8 + ...

C# strings store their length.  The length is a 4 byte signed integer (giving a maximum theoretical string length of 2^31 - 1).

X = 8 + 4 + ...

To speed up marshalling to native code, all .NET strings are additionally NULL terminated with a unicode null terminator.  Without this NULL terminator, all strings passed to Win32 APIs would need to be copied.  With the NULL terminator, API calls that take unicode strings can simply be given a pointer to the .NET string (after the string is pinned).  That's 2 bytes.

X = 8 + 4 + 2 + ...

Then you need to store the characters.  Each .NET char takes 2 bytes.

X = 8 + 4 + 2 + (2 * LEN)

But that's not the whole story.  The .NET garbage collector allocates memory with 32 bit alignment.  In other words, the total amount of memory allocated at a time will always be a multiple of 4 (4, 8, 12, 16, 20, 24 etc).  This theoretically means that every reference in .NET could be stored in 30 bits rather than 32 bits.  Every field that is a reference type in .NET is aligned.  Read more about data alignment here: http://en.wikipedia.org/wiki/Data_structure_alignment

So the final answer is:

X = ((8 + 4 + 2 + (2 * LEN)) + 4 - 1) / 4 * 4    (integer division, i.e. rounded up to the next multiple of 4)
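As a quick sanity check, the formula can be written as a small Python function.  This is only a sketch of the arithmetic above (32 bit system, .NET 4.0 layout); the function name is made up for illustration and nothing here measures a real runtime.

```python
def dotnet_string_size_32bit(length):
    """Estimated heap size in bytes of a .NET 4.0 string on a 32 bit system,
    following the formula in this post (not measured against a real runtime)."""
    header = 8            # lockbits + object type pointer
    length_field = 4      # stored string length
    null_terminator = 2   # trailing unicode NULL
    chars = 2 * length    # each .NET char takes 2 bytes
    raw = header + length_field + null_terminator + chars
    return (raw + 4 - 1) // 4 * 4  # round up to a multiple of 4


for n in (0, 1, 2, 10):
    print(n, dotnet_string_size_32bit(n))
# 0 16
# 1 16
# 2 20
# 10 36
```

Note that the empty string and a 1 character string come out at the same size, which is the alignment padding at work.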

In .NET prior to version 4, .NET strings had an extra field named "m_arrayLength" which was never used.  This made strings at least 4 bytes longer.  This field was removed in 4.0.

Did you know that a Java string has a buffer pointer (in C# the buffer comes straight after the string length) and an offset field used to store an offset within the string buffer?  This allows the java.lang.String.substring(int, int) method to operate in O(1) time rather than O(n) time as in .NET.  The new string returned simply points to the original string's buffer, and the offset is taken into account in all operations.  This has the unfortunate side-effect that a 1 character string produced by a substring call can keep 1MB of memory alive because its originating string was 1MB.
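Here is a toy Python model of that shared-buffer layout.  The class and method names (SharedString, substring, copy) are invented for this sketch and are not the real java.lang.String internals; it only shows why substring is O(1) and why the big buffer stays alive.

```python
class SharedString:
    """Toy string that shares its character buffer with its substrings."""

    def __init__(self, buffer, offset=0, count=None):
        self.buffer = buffer    # list of characters, possibly shared
        self.offset = offset    # where this string starts in the buffer
        self.count = len(buffer) - offset if count is None else count

    def substring(self, begin, end):
        if not 0 <= begin <= end <= self.count:
            raise IndexError("substring range out of bounds")
        # O(1): no characters are copied, only the offset and count change
        return SharedString(self.buffer, self.offset + begin, end - begin)

    def copy(self):
        # O(n): detach from the shared buffer by copying just our characters
        return SharedString(self.buffer[self.offset:self.offset + self.count])

    def __str__(self):
        return "".join(self.buffer[self.offset:self.offset + self.count])


big = SharedString(list("x" * 1_000_000))
tiny = big.substring(10, 11)

print(str(tiny))                  # x
print(tiny.buffer is big.buffer)  # True: the 1 character string pins the whole buffer
print(len(tiny.copy().buffer))    # 1
```

Copying the substring, as the last line does, is the usual way to let the large buffer be collected.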

Friday, July 20, 2012

Why do Android animations stutter when iOS animations are so smooth?

The underlying issue is the different animation models on Android and iOS.  iOS uses CoreAnimation, an API created by the iPhone team for the original iPhone that was back ported to the desktop OSX.  CoreAnimation is a retained mode graphics strategy.  Microsoft WP7 also uses retained mode.  Google's Android uses what is known as immediate mode graphics.

All GUIs generally work the same way.  There is a main thread with a loop that processes messages from a queue.  Messages can range from "move view to this location" to "user has performed a touch at location".  The whole point is that it is a queue, so messages generally get processed one at a time and in first come, first served order.

For the majority of UI toolkits, including those found on iOS and Android, accessing and modifying objects must be done in the main thread.  Despite sometimes being called the UI thread, it is usually also the main thread and often is responsible for not just painting, changing colours, moving objects but also for loading files, decoding images, handling network responses etc.

In Android, if you want to animate an object and make it move from location1 to location2, the animation API figures out the intermediate locations (tweening) and then queues the appropriate move operations onto the main thread at the appropriate times using a timer.  This works fine except that the main thread is usually used for many other things: painting, opening files, responding to user inputs etc.  A queued timer can often be delayed.  Well written programs will always try to do as many operations as possible in background (non main) threads, but you can't always avoid using the main thread.  Operations that require you to operate on a UI object always have to be done on the main thread.  Also, many APIs will funnel operations back to the main thread as a form of thread-safety.  It is almost impossible to keep every operation on the main thread under 1/60th of a second so that animations can be processed smoothly.  Even if Google could manage to get their code to do just that, it doesn't mean third party application writers will be able to.
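To make the delay concrete, here is a rough Python simulation of a single main-thread message queue on a virtual clock.  It is not Android code: the message format, the function names and the 50 ms "decode image" job are all assumptions made up for this sketch.

```python
import heapq

FRAME_MS = 16  # roughly 1/60th of a second


def tween(start, end, frames):
    """Linearly interpolated positions from start to end, inclusive."""
    return [start + (end - start) * i / frames for i in range(frames + 1)]


def run_main_loop(messages):
    """Run (due_ms, cost_ms, label) messages one at a time on a virtual clock.

    Returns (label, due_ms, started_ms) for each message in processing order.
    """
    queue = list(messages)
    heapq.heapify(queue)  # earliest due time first
    clock = 0
    log = []
    while queue:
        due, cost, label = heapq.heappop(queue)
        clock = max(clock, due)  # idle until the message is due
        log.append((label, due, clock))
        clock += cost  # nothing else can run while this message is handled
    return log


# Five animation steps, one per frame, each cheap to process...
positions = tween(0, 100, 4)
messages = [(i * FRAME_MS, 1, f"move to x={x:.0f}") for i, x in enumerate(positions)]
# ...plus one slow job that lands on the same thread at t=10 ms.
messages.append((10, 50, "decode image"))

for label, due, started in run_main_loop(messages):
    print(f"{label:16} due {due:3} ms, ran {started:3} ms, late {started - due:2} ms")
```

In this run the moves due at 16, 32 and 48 ms all execute in a burst at around 60 ms (44, 29 and 14 ms late), which is what a stutter looks like from the queue's point of view.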

In iOS, operations on UI objects must also be done on the main thread, with the exception of animation operations done via CoreAnimation.  CoreAnimation runs on a background thread and is able to directly manipulate, move, recolor and reshape UI objects on that background (CoreAnimation) thread.  Compositing and rendering are also performed in this thread.  It does this through a combination of hardware and software, providing very smooth and fast animations.  From the main thread you can basically issue a call to CoreAnimation and tell it to move object1 from location1 to location2.  This animation will continue to run even if the main thread is blocked performing another operation.  This is why animations will almost never stutter on iOS.

Think of the iOS model this way: The main thread manages application data and UI application state (UI application state includes things such as the strings to be displayed in a ListView etc) but issues physical UI state change requests to a separate and dedicated high priority CoreAnimation thread (physical states include things such as colour, position and shape).  All physical state changes can be animated and CoreAnimation will also perform the tweening for you (like the Android animation APIs).  Non-animated physical state changes will be issued directly by CoreAnimation, and the main thread (not the CoreAnimation thread) will block until those are performed.  Animated physical state changes that are issued by the main thread will be performed asynchronously by the CoreAnimation thread.  Because physical UI state, and only physical UI state, is managed by the CoreAnimation thread, the main thread can be blocked or busy while the CoreAnimation thread continues not only to accurately render the last known state of the UI (as issued by the main thread) but also to render any pending or incomplete animated physical state changes requested by the main thread.
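The same idea as a toy Python model: the main thread hands over a description of the animation once, and a separate renderer works out the position for each frame from its own clock.  The Renderer class and its methods are hypothetical names for this sketch and bear no relation to the real CoreAnimation API.

```python
class Renderer:
    """Toy stand-in for a dedicated animation thread with its own frame clock."""

    def __init__(self):
        self.positions = {}   # object name -> last known position
        self.animations = {}  # object name -> (start, end, begin_ms, duration_ms)

    def set_position(self, name, x):
        """Non-animated physical state change."""
        self.animations.pop(name, None)
        self.positions[name] = x

    def animate(self, name, end, begin_ms, duration_ms):
        """Animated change: the main thread only describes it, once."""
        start = self.positions[name]
        self.animations[name] = (start, end, begin_ms, duration_ms)

    def frame(self, now_ms):
        """Called once per frame by the renderer itself, never by the main thread."""
        for name, (start, end, begin, duration) in self.animations.items():
            progress = min(max((now_ms - begin) / duration, 0.0), 1.0)
            self.positions[name] = start + (end - start) * progress  # tweening
        return dict(self.positions)


renderer = Renderer()
renderer.set_position("box", 0)
renderer.animate("box", 100, begin_ms=0, duration_ms=64)

# Imagine the main thread is now stuck decoding an image for 50 ms.
# The renderer does not care; it just keeps producing frames on schedule.
for t in range(0, 80, 16):
    print(t, renderer.frame(t)["box"])
# 0 0.0
# 16 25.0
# 32 50.0
# 48 75.0
# 64 100.0
```

Because the position is computed from the frame time rather than from messages delivered through the main queue, a busy main thread cannot make a frame late.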

In Windows Vista, Microsoft introduced desktop composition whereby the OS maintained a separate pixel buffer for every window.  This meant that even if an application hung, the last state of the window (how it looked) is still rendered rather than just being drawn as white (the OS partially managed the state of the pixels in the window).  CoreAnimation goes beyond this and offloads much of the UI work traditionally managed by the main thread including managing the state of not just the pixels (like Vista) but of higher level concepts such as widgets, widget locations, widget colours etc.

iOS and Android use completely different software architectures for performing animations.  Apple probably put more focus on creating something like CoreAnimation because Apple are a lot more OCD about design and user experience than most software companies.  I'm sure Steve would have had a few words with the iOS architects if scrolling stuttered when reading an email because an image in the email needed to be loaded.  Humans aren't computers.  Often the perception of performance is more important than the actual timed performance.  An email that takes 50ms longer to load won't be as noticeable as a touch screen that doesn't instantly respond and move when a user slides their finger over it.

There is nothing too wrong with the Android animation model.  It's the way many toolkits work, including Flash, which was definitely very animation heavy.  I would say the iOS model makes the overall user experience nicer and offloads one more worry for the developer back to the operating system.  I'm sure Google will continue to recognise the importance of animation on touch screen devices and continue to accelerate (or rearchitect) Android in coming releases.

A 5 year old 1st generation iPhone will perform smoother and more reliable animations than the latest quad core Samsung Android phone.  It's a software design problem and not something you can throw more cores at (not least because the main thread will only ever run on one core!).  Don't believe people when they excuse stutter and lag as "oh just the Android Java garbage collector".  Modern compacting, generational garbage collectors generally aren't the cause of the kind of stutter you see on Android.

For the moment, you will never see something as simple as a loading wheel stutter in iOS. I hope this explains why :-)

Saturday, June 2, 2012

Audjustable AudioPlayer/AudioStreamer component for iOS

I've written a new AudioPlayer/AudioStreamer for iOS.  It improves upon Matt Gallagher's AudioStreamer by adding queueing, gapless playback, pre-buffering and a simple but extensible data source model (it works with more than HTTP).

You can find it at Google Code:

http://code.google.com/p/audjustable

Wednesday, October 19, 2011

Playing back VC-1 (WVC-1) videos with DXVA in Windows Media Centre 7

After a lot of messing around with codecs, I've successfully gotten VC-1 videos to play back using DXVA hardware acceleration in Windows Media Centre on Windows 7. CPU usage is 8%-15% on a 2.13GHz Core 2 Duo. This is down from around 80%-90%. GPU usage is around 10% on an NVidia GT440.

Don't bother with the Microsoft DMO codecs for VC-1 as they don't use DXVA (at least they don't with my NVidia GT440) and use more CPU time than software based ffdshow.

The ffdshow DXVA enabled codec for VC-1 seems to crash on the VC-1 streams I tried.

Solution:
Download the Media Player Classic (MPC-HC) standalone filters from here. Make sure you download the 64bit version if you're using 64bit Windows Media Centre.
Extract MPCVideoDec.ax from the zip file to a safe location and then register it from the command prompt using "regsvr32 MPCVideoDec.ax".
Use Win7DSFilterTweaker and select "MPCVideoDec" under xx-bit decoders/VC-1.

Everything should work straight after that (no reboot required). If you encounter any problems then you can use the DSFilterTweaker to switch back to the Microsoft decoder and "regsvr32 /u MPCVideoDec.ax" to deregister the MPC codec.

Wednesday, December 1, 2010

mflow beta

We're beta testing our new web based product offering at beta.mflow.com.

Give it a go! :-)

Try listening to my music recommendations!

Friday, November 26, 2010

Gah, browser bugs

After an hour of debugging I tracked down a bug due to how Firefox handles Ajax calls.

Ajax calls generally come with an X-Requested-With: XMLHttpRequest header (added by most JavaScript libraries) to indicate that a call is an Ajax call.

If the response to the call is a 302 (redirect), browsers will make an additional call to the new URL, but in Firefox this second call does not contain the X-Requested-With header.
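One possible server-side workaround (an assumption on my part, not something from the original debugging session) is to stop relying on the header surviving the redirect and carry the flag in the redirect URL instead.  A minimal Python sketch; the "ajax" query parameter and both function names are hypothetical:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

AJAX_HEADER = "X-Requested-With"
AJAX_PARAM = "ajax"  # hypothetical query parameter used to carry the flag


def is_ajax(headers, url):
    """True if the request carries the Ajax header or the fallback parameter.

    headers is a plain dict here, so the header name lookup is case sensitive.
    """
    if headers.get(AJAX_HEADER) == "XMLHttpRequest":
        return True
    query = dict(parse_qsl(urlsplit(url).query))
    return query.get(AJAX_PARAM) == "1"


def redirect_location(target_url, headers, request_url):
    """Build the Location value for a 302, preserving the Ajax flag in the URL."""
    if not is_ajax(headers, request_url):
        return target_url
    parts = urlsplit(target_url)
    query = parse_qsl(parts.query)
    if (AJAX_PARAM, "1") not in query:
        query.append((AJAX_PARAM, "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


location = redirect_location(
    "/login?next=%2Fhome", {"X-Requested-With": "XMLHttpRequest"}, "/data"
)
print(location)               # /login?next=%2Fhome&ajax=1
print(is_ajax({}, location))  # True, even though the header was dropped
```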