Monday, July 7, 2014

Vote for "Unmanaged generic type constraint + generic pointers" uservoice idea for .NET

If you care about performance, maybe because you are working with big data, image analysis, computer vision, machine learning etc., please go to Visual Studio UserVoice and vote for the "Unmanaged generic type constraint + generic pointers" idea.

Microsoft is taking these UserVoice ideas seriously (very good) and is in fact currently working on C# and SIMD in the form of, at least for now, an out-of-band library, Microsoft.Bcl.Simd, and the new Just-in-Time compiler, RyuJIT.

Getting support for an unmanaged generic type constraint and generic pointers, combined with SIMD operations based on unsafe pointers, would be absolutely awesome. However, the System.Numerics.Vector types and methods in the current version 1.0.2-beta of Microsoft.Bcl.Simd are very limited in type support and have no support for creating "vectors" over pointers.

Currently, the lack of an unmanaged generic type constraint and generic pointers means that if you are doing image processing on different primitive types (byte, sbyte, short, ushort, int, uint, long, ulong, float, double etc.), you have to repeat code for each of these if you care about type safety and performance. This leads to massive code bloat and combinatorial explosions; I have seen methods with thousands of overloads (generated using T4), which Visual Studio's IntelliSense simply can't handle. Getting the above feature would allow a lot of this code to be boiled down to a single generic method, e.g. (a draft example using the value type trick to get inlining inside the execution loop):

    public interface IFunc<in T, out TResult>
    {
        TResult Invoke(T arg);
    }

    public struct Threshold : IFunc<int, byte>
    {
        readonly int m_threshold;

        public Threshold(int threshold)
        {
            m_threshold = threshold;
        }

        // Called through a generic type argument constrained to a struct,
        // so the JIT can inline this in release builds
        public byte Invoke(int value)
        {
            return value > m_threshold ? byte.MaxValue : byte.MinValue;
        }
    }

    public static class Transforms
    {
        public unsafe static TFunc Transform<T, TResult, TFunc>(
            ArrayPtr2D<T> src, TFunc func, ArrayPtr2D<TResult> dst)
            where T : unmanaged
            where TResult : unmanaged
            where TFunc : struct, IFunc<T, TResult>
        {
            if (src.Size != dst.Size)
            { throw new ArgumentException("Arrays must have same size"); }

            var width = src.Size.Width;
            var height = src.Size.Height;

            var srcStride = src.StrideInBytes;
            T* srcRowPtr = src.DataPtr;
            var srcRowPtrEnd = ((byte*)srcRowPtr) + srcStride * height;

            var dstStride = dst.StrideInBytes;
            TResult* dstRowPtr = dst.DataPtr;

            for (; srcRowPtr != srcRowPtrEnd;
                   srcRowPtr = (T*)(((byte*)srcRowPtr) + srcStride),
                   dstRowPtr = (TResult*)(((byte*)dstRowPtr) + dstStride))
            {
                var srcColPtr = srcRowPtr;
                var srcColPtrEnd = srcColPtr + width;
                var dstColPtr = dstRowPtr;
                for (; srcColPtr != srcColPtrEnd; ++srcColPtr, ++dstColPtr)
                {
                    *dstColPtr = func.Invoke(*srcColPtr);
                }
            }
            return func;
        }
    }

    public class Program
    {
        public static void Main()
        {
            // Initialize from some existing native data i.e. from a bitmap etc.
            ArrayPtr2D<int> src = ...;
            ArrayPtr2D<byte> dst = ...;
            // Unsafe, fast, loop with threshold invoke inlined
            Transforms.Transform(src, new Threshold(1275), dst); 
        }
    }
Note how, because the transform takes type 'T' as input and type 'TResult' as output, we can handle a lot of combinations with this without repeating the code. The JIT will, of course, have to generate specific code for each actual type usage etc. However, this is exactly what we want: optimized code for each specific type combination and value type func. Who wouldn't want that ;)
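
The value type trick itself does not depend on the requested feature, so it can be tried out on plain managed arrays. Below is a minimal sketch under that assumption; `ArrayTransforms` and `Demo` are hypothetical names used only for illustration, and bounds-checked arrays stand in for the generic pointers this post asks for.

```csharp
using System;

public interface IFunc<in T, out TResult>
{
    TResult Invoke(T arg);
}

public struct Threshold : IFunc<int, byte>
{
    readonly int m_threshold;

    public Threshold(int threshold)
    {
        m_threshold = threshold;
    }

    public byte Invoke(int value)
    {
        return value > m_threshold ? byte.MaxValue : byte.MinValue;
    }
}

// Hypothetical helper for illustration, not part of the proposal above
public static class ArrayTransforms
{
    public static void Transform<T, TResult, TFunc>(T[] src, TFunc func, TResult[] dst)
        where TFunc : struct, IFunc<T, TResult>
    {
        if (src.Length != dst.Length)
        { throw new ArgumentException("Arrays must have same length"); }

        for (int i = 0; i < src.Length; i++)
        {
            // TFunc is a struct type argument, so this call is neither boxed
            // nor dispatched through the interface at run time
            dst[i] = func.Invoke(src[i]);
        }
    }
}

public static class Demo
{
    public static byte[] Run()
    {
        var src = new[] { 0, 1275, 1276, 5000 };
        var dst = new byte[src.Length];
        // All three type arguments are inferred from the arguments
        ArrayTransforms.Transform(src, new Threshold(1275), dst);
        return dst; // { 0, 0, 255, 255 }
    }
}
```

Since the JIT compiles a separate specialization for each value type used as a type argument, every `TFunc` struct gets its own copy of the loop, which is what gives the JIT the chance to inline `Invoke`.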

F# already has the unmanaged generic type constraint, as can be seen in Constraints (F#). We just need this in C#, along with generic pointers to unmanaged types.

So perhaps instead of obsessing about "Change All CAPS Menu in VS 2012 to VS Beta format File Edit Instead of FILE EDIT" (although I agree ALL CAPS is a terrible design choice), vote for something that we could all enjoy: less code and higher performance ;)



Friday, February 21, 2014

Automatically Embed Copy Local Assemblies with Symbols in MSBuild

Many have written about how to automatically embed assemblies into an executable, and there are lots of questions about this on Stack Overflow as well. However, none of these show how pdb-files can be embedded too, to ensure symbols are also loaded when resolving embedded assemblies, as detailed in Embedded Assembly Loading with support for Symbols and Portable Class Libraries in C#.

Automatically embed all dll- and pdb-files, excluding xml-files

The solution, shown below, is a simple extension of what Daniel Chambers has described, but it also embeds pdb-files and excludes xml-files from being copied to the output directory, since many libraries include these documentation files.
<Target Name="EmbedReferencedAssemblies" AfterTargets="ResolveAssemblyReferences">
  <ItemGroup>
    <!-- get list of assemblies marked as CopyToLocal -->
    <FilesToEmbed Include="@(ReferenceCopyLocalPaths)" 
                  Condition="('%(ReferenceCopyLocalPaths.Extension)' == '.dll' Or '%(ReferenceCopyLocalPaths.Extension)' == '.pdb')" />
    <FilesToExclude Include="@(ReferenceCopyLocalPaths)" 
                  Condition="'%(ReferenceCopyLocalPaths.Extension)' == '.xml'" />

    <!-- add these assemblies to the list of embedded resources -->
    <EmbeddedResource Include="@(FilesToEmbed)">
      <LogicalName>%(FilesToEmbed.DestinationSubDirectory)%(FilesToEmbed.Filename)%(FilesToEmbed.Extension)</LogicalName>
    </EmbeddedResource>

    <!-- no need to copy the assemblies locally anymore -->
    <ReferenceCopyLocalPaths Remove="@(FilesToEmbed)" />
    <ReferenceCopyLocalPaths Remove="@(FilesToExclude)" />
  </ItemGroup>

  <Message Importance="high" Text="Embedding: @(FilesToEmbed->'%(Filename)%(Extension)', ', ')" />
</Target>
To use this, simply copy and paste it into the executable project file (e.g. *.csproj) right after:
<Import Project="$(MSBuildToolsPath)\Microsoft.CSharp.targets" />

Automatically embed all dll- and pdb-files, excluding xml-files and mixed mode assemblies

Unfortunately, as far as I know, embedding mixed mode assemblies (i.e. with both managed and native code, for example from a C++/CLI project) does not work, so these still have to be copied to the build output. At least, that is the case if you do not want to extract the embedded file, as detailed in Single Assembly Deployment of Managed and Unmanaged Code.

One solution to this is to simply exclude these files by adding exclude conditions to the above XML. For example:
<Target Name="EmbedReferencedAssemblies" AfterTargets="ResolveAssemblyReferences">
  <ItemGroup>
    <!-- get list of assemblies marked as CopyToLocal -->
    <FilesToEmbed Include="@(ReferenceCopyLocalPaths)" 
                  Condition="('%(ReferenceCopyLocalPaths.Extension)' == '.dll' Or '%(ReferenceCopyLocalPaths.Extension)' == '.pdb') And '%(Filename)'!='MixedModeAssemblyA' And '%(Filename)'!='MixedModeAssemblyB'" />
    <FilesToExclude Include="@(ReferenceCopyLocalPaths)" 
                  Condition="'%(ReferenceCopyLocalPaths.Extension)' == '.xml'" />

    <!-- add these assemblies to the list of embedded resources -->
    <EmbeddedResource Include="@(FilesToEmbed)">
      <LogicalName>%(FilesToEmbed.DestinationSubDirectory)%(FilesToEmbed.Filename)%(FilesToEmbed.Extension)</LogicalName>
    </EmbeddedResource>

    <!-- no need to copy the assemblies locally anymore -->
    <ReferenceCopyLocalPaths Remove="@(FilesToEmbed)" />
    <ReferenceCopyLocalPaths Remove="@(FilesToExclude)" />
  </ItemGroup>

  <Message Importance="high" Text="Embedding: @(FilesToEmbed->'%(Filename)%(Extension)', ', ')" />
</Target>
I would love to have a solution that actually checks whether an assembly is mixed mode (i.e. not pure) before embedding it, or that at least takes a list of assembly names to exclude instead of the crude condition hack above.
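
One way the exclusion list might look is sketched below. This is untested and rests on two assumptions: MSBuild 4.0 or later (for property functions), and item metadata being expanded before the property function is evaluated. `EmbedExcludeNames` is a hypothetical property name introduced here for illustration. The item replaces the `FilesToEmbed` item in the target above; the rest of the target stays unchanged.

```xml
<PropertyGroup>
  <!-- Hypothetical property: names (without extension) that must NOT be embedded.
       The leading and trailing ';' make the Contains check match whole names only. -->
  <EmbedExcludeNames>;MixedModeAssemblyA;MixedModeAssemblyB;</EmbedExcludeNames>
</PropertyGroup>

<!-- Inside the ItemGroup of the EmbedReferencedAssemblies target -->
<FilesToEmbed Include="@(ReferenceCopyLocalPaths)"
              Condition="('%(ReferenceCopyLocalPaths.Extension)' == '.dll' Or '%(ReferenceCopyLocalPaths.Extension)' == '.pdb') And !$(EmbedExcludeNames.Contains(';%(ReferenceCopyLocalPaths.Filename);'))" />
```

With this shape, adding another assembly to the exclusion list only means editing the property value, and the pdb-file of an excluded assembly is skipped as well since it shares the same file name.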

One could also imagine checking the path of the assembly and whether this has AnyCPU, x86, x64 in the path or similarly as a convention for embedding or not embedding the given assembly. Lots of other improvements should be possible...

There is also a complete solution out there in the form of Costura.Fody, which exists as a convenient nuget package as well, see http://www.nuget.org/packages/Costura.Fody. This does, however, rely on IL rewriting which may be a problem for some. It does look as if it handles all possible issues via configuration, though.

Tuesday, February 18, 2014

Embedded Assembly Loading with support for Symbols and Portable Class Libraries in C#

Jeffrey Richter has previously written about how to deploy a single executable file for an application by embedding dependencies as resources in the main application assembly in Jeffrey Richter: Excerpt #2 from CLR via C#, Third Edition.

However, this solution has a few issues. It does not handle Portable Class Libraries (PCLs) and does not show how to support loading symbols from embedded pdb-files either. The code presented below handles both.

As with the solution Jeffrey Richter details, one simply adds a handler to the current domain's AssemblyResolve event, which is called whenever an assembly could not be resolved directly. However, this also occurs when an embedded portable class library (such as Autofac) has a dependency on a BCL assembly (e.g. System.Core 2.0.5.0). In this case you have to check whether the assembly is retargetable and, if so, load it directly via the usual CLR mechanism so the appropriate version is loaded.
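
The handler in this post detects retargetable assemblies by matching the end of the display name. As an alternative, the check can be expressed through the parsed name's flags. The following is a minimal sketch, assuming the runtime's `AssemblyName` parser sets `AssemblyNameFlags.Retargetable` for `Retargetable=Yes`; I have not verified this on every runtime, and `RetargetableCheck` is a hypothetical helper name.

```csharp
using System.Reflection;

// Hypothetical helper for illustration
public static class RetargetableCheck
{
    // True when the display name parses with the Retargetable flag set
    public static bool IsRetargetable(string assemblyDisplayName)
    {
        var asmName = new AssemblyName(assemblyDisplayName);
        return (asmName.Flags & AssemblyNameFlags.Retargetable) != 0;
    }
}
```

Inside an AssemblyResolve handler this would be called as `RetargetableCheck.IsRetargetable(args.Name)`, which does not depend on where in the display name the `Retargetable=Yes` part appears.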

For a better debug experience and better exception stack traces, it is recommended to include pdb-files as well. A pdb-file is loaded if it has been embedded and has the same name as the dll-file; the handler then uses the Assembly.Load overload that also takes raw symbol data as a byte array.

To use the code do the following:
  • Call SetupEmbeddedAssemblyResolve as the first thing in your application
  • Add dependencies, incl. pdb-files if needed, to your project via Add as Link, then change the build action to Embedded Resource and Copy to Output Directory to Do not copy
  • Change the properties of the assembly references under References and set Copy Local to false
UPDATE: See Automatically Embed Copy Local Assemblies with Symbols in MSBuild for how to automatically embed assemblies instead of doing this manually.
// Requires: using System; using System.Diagnostics; using System.Linq; using System.Reflection;
private static void SetupEmbeddedAssemblyResolve()
{
    AppDomain.CurrentDomain.AssemblyResolve += (sender, args) =>
    {
        var name = args.Name;
        var asmName = new AssemblyName(name);

        // Any retargetable assembly should be resolved directly using a normal load, e.g. the System.Core issue
        if (name.EndsWith("Retargetable=Yes", StringComparison.Ordinal))
        {
            return Assembly.Load(asmName);
        }

        var executingAssembly = Assembly.GetExecutingAssembly();
        var resourceNames = executingAssembly.GetManifestResourceNames();

        var resourceToFind = asmName.Name + ".dll";
        var resourceName = resourceNames.SingleOrDefault(n => n.Contains(resourceToFind));

        if (string.IsNullOrWhiteSpace(resourceName)) { return null; }

        var symbolsToFind = asmName.Name + ".pdb";
        var symbolsName = resourceNames.SingleOrDefault(n => n.Contains(symbolsToFind));

        var assemblyData = LoadResourceBytes(executingAssembly, resourceName);

        if (string.IsNullOrWhiteSpace(symbolsName))
        { 
            Trace.WriteLine(string.Format("Loading '{0}' as embedded resource '{1}'", resourceToFind, resourceName));

            return Assembly.Load(assemblyData);
        }
        else
        {
            var symbolsData = LoadResourceBytes(executingAssembly, symbolsName);

            Trace.WriteLine(string.Format("Loading '{0}' as embedded resource '{1}' with symbols '{2}'", resourceToFind, resourceName, symbolsName));

            return Assembly.Load(assemblyData, symbolsData);
        }
    };
}

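private static byte[] LoadResourceBytes(Assembly executingAssembly, string resourceName)
{
    using (var stream = executingAssembly.GetManifestResourceStream(resourceName))
    {
        var data = new byte[stream.Length];

        // Stream.Read may return fewer bytes than requested, so loop until the buffer is full
        var offset = 0;
        while (offset < data.Length)
        {
            var read = stream.Read(data, offset, data.Length - offset);
            if (read == 0)
            { throw new System.IO.EndOfStreamException("Unexpected end of resource '" + resourceName + "'"); }
            offset += read;
        }

        return data;
    }
}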
Based on Jeffrey Richter: Excerpt #2 from CLR via C#, Third Edition and FileNotFoundException when trying to load Autofac as an embedded assembly.
