C++/CLI (Part 2): Mono C++/CLI Native Calls and P/Invoke Calls

阿新 • Published: 2019-03-29

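This article is a translation of the Mono C++ documentation. Its point is that a DLL built from a CLR program in VS cannot be called from Unity, because Mono's native-call encoding differs from that of the MS CLR. If Unity needs to call C++ code, it has to use P/Invoke. This incompatibility leaves C++/CLI, which would otherwise be very convenient, with no use at all under Unity. I hope that one day MS will give the Mono CLR some room, which would make life easier for everyone. I also spent a happy half month writing MS CLR code on the assumption that it would work in Unity, only to have it blow up on the first run. So from now on: write the unit tests before touching the code. This whole area of the code now has to be refactored, a lesson paid for in blood and tears.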

Introduction

============

The Common Language Infrastructure (CLI) is designed to make it "easy" to interoperate with existing code. In principle, all you need to do is create a DllImport function declaration for the existing code to invoke, and the runtime will handle the rest. For example:

 [DllImport ("libc.so")]
 private static extern int getpid ();

Please note that most of the classes and enumerations mentioned in this document reside in the System.Runtime.InteropServices namespace.

The above C# function declaration would invoke the POSIX getpid(2) system call on platforms that have the libc.so library. If libc.so exists but doesn't have the getpid export, an EntryPointNotFoundException exception is thrown. If libc.so can't be loaded, a DllNotFoundException exception is thrown. Simple. Straightforward. What could be easier?

There are three problems with this:

1. Specifying the library in the DllImport statement.
2. Determining what function to actually invoke.
3. Passing parameters; most existing code is far more complex. Strings will need to be passed, structures may need to be passed, memory management practices will become involved...

Existing code is a complex beast, and the interop layer needs to support this complexity.

Library Handling

How does the runtime find the library specified in the DllImport attribute? This question is inherently platform specific.

Windows DLL Search Path

From the MSDN LoadLibrary documentation, the DLLs needed by the program are searched for in the following order:

1. The directory from which the application loaded.
2. The current directory.
3. The system directory. Use the GetSystemDirectory() function to get the path of this directory.
4. The 16-bit system directory.
5. The Windows directory. Use the GetWindowsDirectory() function to get the path of this directory.
6. The directories that are listed in the PATH environment variable.

Of course, reality isn't quite that simple. In practice, the "system" directory is actually %WINDIR%\system32, except on Windows 9x platforms where it's %WINDIR%\system. The 16-bit system directory is typically %WINDIR%\system, but isn't recognized as a separate search directory on Windows 9x platforms.

Furthermore, on Windows Server 2003 and Windows XP SP1, the registry entry HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Session Manager\SafeDllSearchMode alters the above ordering. If this is 1 (the default), then the current directory is searched after the system and Windows directories. This is a security feature (it prevents a trojan library from being loaded instead of, for example, OLE32.DLL), but it turns the above list into: 1, 3, 4, 5, 2, 6.

Linux Shared Library Search Path

From the dlopen(3) man page, the necessary shared libraries needed by the program are searched for in the following order:

1. A colon-separated list of directories in the user's LD_LIBRARY_PATH environment variable. This is a frequently-used way to allow native shared libraries to be found by a CLI program.
2. The list of libraries cached in /etc/ld.so.cache. /etc/ld.so.cache is created by editing /etc/ld.so.conf and running ldconfig(8). Editing /etc/ld.so.conf is the preferred way to search additional directories, as opposed to using LD_LIBRARY_PATH, as this is more secure (it's more difficult to get a trojan library into /etc/ld.so.cache than it is to insert it into LD_LIBRARY_PATH).
3. /lib, followed by /usr/lib.

As a Mono extension, if the library being loaded is __Internal, then the main program is searched for method symbols. This is equivalent to calling dlopen(3) with a filename of NULL. This allows you to P/Invoke methods that are within an application that is embedding Mono.

See also: the dlopen(3) man page, the ld.so(8) man page, Dissecting shared libraries.

macOS Framework and .dylib Search Path

The Framework and library search path is:

1. A colon-separated list of directories in the user's DYLD_FRAMEWORK_PATH environment variable.
2. A colon-separated list of directories in the user's DYLD_LIBRARY_PATH environment variable.
3. A colon-separated list of directories in the user's DYLD_FALLBACK_FRAMEWORK_PATH environment variable, which defaults to the directories:
   - ~/Library/Frameworks
   - /Library/Frameworks
   - /Network/Library/Frameworks
   - /System/Library/Frameworks
4. A colon-separated list of directories in the user's DYLD_FALLBACK_LIBRARY_PATH environment variable, which defaults to the directories:
   - ~/lib
   - /usr/local/lib
   - /lib
   - /usr/lib

Note: Mono uses GLib to load libraries, and GLib has a bug on macOS where it doesn't use a .dylib extension, but instead uses the Unix .so extension. While this should eventually be fixed, the current workaround is to write a .config file which maps to the .dylib file, e.g.

 <configuration>
   <dllmap dll="mylib" target="mylib.dylib" />
 </configuration>

TODO: Will mono support both frameworks and dylibs?

Library Names

Knowing where to look for the library is only half of the problem. Knowing what library to load is the other half.

Different platforms have different naming conventions. Windows platforms append .DLL to the library name, such as OLE32.DLL. Linux platforms use a lib prefix and a .so suffix (see Note 1). macOS platforms have a lib prefix and a .dylib suffix, unless they're a Framework, in which case they're a directory and things get more complicated.

Note 1: Strictly speaking, Unix shared libraries are typically versioned, and the version number follows the .so suffix. For example, libfreetype.so.6.3.3 is a fully versioned library. Versioning throws a "wrench" into the works, and is best dealt with through Mono's <dllmap/> mechanism; see below for details.

If you have control over the library name, keep the above naming conventions in mind and don't use a platform-specific library name in the DllImport statement. Instead, just use the library name itself, without any prefixes or suffixes, and let the runtime find the appropriate library when the call is made. For example:

 [DllImport ("MyLibrary")]
 private static extern void Frobnicate ();

Then, you just need to provide MyLibrary.dll for Windows platforms, libMyLibrary.so for Unix platforms, and libMyLibrary.dylib for macOS platforms.

Note: Windows will not automatically append a .dll extension to library names that already have a period (.) in their name, such as libgtk-win32-2.0-0.dll. If you try to use libgtk-win32-2.0-0 as the library name, Windows won't automatically append .dll, resulting in a DllNotFoundException. Consequently you should either avoid periods in library names or always use the full filename (including the .dll extension) and rely on Mono's <dllmap/> mechanism.

What if you don't have the same name across all platforms? For example, the GTK+ library name on Windows is libgtk-win32-2.0-0.dll, while the Unix equivalent library is libgtk-x11-2.0.so. How do you write portable Platform Invoke (P/Invoke) code that will work cross-platform?

The short answer is that you don't. There is no standard way of specifying platform-specific library names.

However, as an extension, Mono provides a library mapping mechanism. Two places are searched for library mappings: the $prefix/etc/mono/config XML file, and a per-assembly .config file located in the same directory as the assembly. The .config file must be named after the assembly, with ".config" as the extension, e.g. MyAssembly.exe.config or MyAssembly.dll.config. These files contain <dllmap/> elements, which map an input library (the library specified in the DllImport statement) to the actual platform-specific library to load. For example:

 <configuration>
    <dllmap dll="libgtk-win32-2.0-0.dll" target="libgtk-x11-2.0.so" />
 </configuration>

Unlike .NET, Mono permits .DLL assemblies to have .config files, which are only used for this library mapping mechanism.

Using this mechanism, the Mono-endorsed way of specifying DllImport library names is to always use the Windows library name (as Microsoft .NET has no library mapping mechanism), and then provide a mapping in the per-assembly .config file. This is what the Gtk# library does.

This mechanism can also be used to load strongly-versioned libraries on Unix platforms. For example:

 <configuration>
   <dllmap dll="gtkhtml-3.0" target="libgtkhtml-3.0.so.4" />
 </configuration>

Invoking Unmanaged Code

As far as managed code is concerned, unmanaged code is invoked merely by invoking a method with an associated DllImport attribute. The CLI runtime must do more work to actually invoke the unmanaged code.

In principle, this is a straightforward process. The library specified in the DllImport attribute is loaded, as described above. Then, the specified function is looked up (via GetProcAddress() or dlsym(3)). Finally, the function is invoked.

But what string is used for the function lookup (in GetProcAddress() or dlsym(3))? By default, the name of the managed code method is used, which is why getpid() in the above example invokes getpid(2) from the C library.

Alternatively, the DllImport attribute's EntryPoint field can be set, and that string will be used instead.

Either way, the string used is assumed to refer to a C ABI-compatible function exported by the specified library. On some platforms, this may cause a leading underscore to be prefixed to the symbol name. Other platforms generate no mangling.
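As a rough native-side sketch of the load, look up, and invoke sequence described above, the following C code resolves a function by name with dlopen(3) and dlsym(3) and then calls it. It assumes a POSIX system. The names call_by_name and getpid_fn are hypothetical, and passing NULL to dlopen(3) mirrors the __Internal case instead of loading a separate library. A real runtime also marshals arguments, which is left out here.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* dlsym(3) only returns an address, so the caller has to know the signature. */
typedef pid_t (*getpid_fn) (void);

/* Hypothetical helper: look up `symbol` in the running program and call it.
   Returns (pid_t) -1 if the program handle or the symbol cannot be found. */
pid_t call_by_name (const char *symbol)
{
  assert (symbol != NULL);

  /* A NULL filename selects the main program and its loaded dependencies. */
  void *handle = dlopen (NULL, RTLD_LAZY);
  if (handle == NULL)
    return (pid_t) -1;              /* comparable to DllNotFoundException */

  /* POSIX idiom for storing the result of dlsym(3) in a function pointer. */
  getpid_fn fn = NULL;
  *(void **) (&fn) = dlsym (handle, symbol);

  /* A NULL result is comparable to EntryPointNotFoundException. */
  pid_t result = (fn != NULL) ? fn () : (pid_t) -1;
  dlclose (handle);
  return result;
}
```

Calling call_by_name ("getpid") should give the same value as calling getpid(2) directly, because the string is all that ties the managed declaration to the native symbol.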

Note that a C ABI is assumed. This makes it nearly impossible to directly invoke functions that are not C ABI compatible, such as C++ library functions that are not extern "C". Some variation on the C ABI is permitted, such as variation in the function's CallingConvention. The default CallingConvention is platform-specific. Under Windows, Winapi is the default, as this is used for most Win32 API functions. (Winapi is equivalent to Stdcall for Windows 9x and Windows NT.) Under Unix platforms, Cdecl is the default.

Calling convention can be specified in C code by using the __stdcall and __cdecl keywords under Microsoft Visual C++, and by using the __attribute__((stdcall)) and __attribute__((cdecl)) function attributes under GCC.

Does having the default CallingConvention vary between platforms cause portability problems? Yes. All the more reason to write as much code as possible as managed code, avoiding the whole P/Invoke/marshaling conundrum in the first place.

If you need to invoke C++ code, you have two choices: (1) make the C++ function extern "C", treat it as a C function, and make sure that it uses a known calling convention; (2) don't make the function extern "C", but make sure it uses a known calling convention. If you use option (2), you'll need to set the DllImport.EntryPoint field to the C++ mangled function name, such as _Z6getpidv. You can retrieve the mangled name through your compiler's binary tools, such as OBJDUMP.EXE or nm(1). Note that C++ mangled names are highly compiler specific, and will:

make your .NET assembly platform specific (you'll need a different assembly for each different platform);
require updating the .NET assembly every time you change C++ compilers (as the C++ name mangling scheme varies by compiler and can -- and frequently will -- change); and
be really ugly to maintain because of the above. This option is not recommended.

If you have lots of C++ code that needs to be wrapped, you might want to look into SWIG, a code generation program that easily wraps existing C and C++ code for use by a multitude of languages, including CLI languages. This makes it easier to invoke C++ code from a CLI application.

If you call a function that is not present in the native library (or that is not public), you will get an EntryPointNotFoundException. To find out which symbols a library exports, the following command is useful (the example uses a shared library from Subversion):

 objdump -T /usr/lib/libsvn_client-1.so.0

Runtime Exception Propagation

The above section mentioned a key point: P/Invoke assumes that the unmanaged code conforms to the C ABI. C doesn't support exceptions. As such, it is assumed that runtime exceptions will not propagate through unmanaged code.

Furthermore, it's fairly simple for an exception to propagate through unmanaged code whenever unmanaged code invokes managed code. This typically occurs through the use of callbacks -- using a function pointer on the unmanaged side which can invoke a delegate on the managed side. It is very important that the managed code not propagate any exceptions -- it must catch all exceptions, or else the unmanaged code calling the delegate will break.

The problem is, again, C doesn't support exceptions. C++ supports exceptions, BUT, and this is crucial, the C++ exception mechanism will be different from the managed code exception mechanism (with one exception to this rule). Since managed code doesn't know about unmanaged code's exception handling support (C is assumed, and C doesn't support exceptions), unmanaged exception handling support might as well not exist, because it won't be used.

The one exception to this is when you use both Microsoft .NET and Microsoft Visual C++ to compile the unmanaged code. .NET uses Windows Structured Exception Handling (SEH) at the P/Invoke layer for its exception handling mechanism, and Microsoft Visual C++ uses SEH to implement C++ exception handling and supports the use of SEH in C as a language extension through the __try, __except, and __finally keywords. SEH is a Microsoft extension; it does not exist outside of Microsoft and .NET, and as such is not portable.

Given the above scenario -- unmanaged code invokes function pointer which generates a managed exception -- what would happen? The managed exception handling mechanism is executed: the stack is searched for an appropriate exception handler, then the stack is unwound, with any finally blocks executed during the stack unwind process.

Note two things: Managed code will be walking the stack, requiring that the CPU Stack Pointer and Instruction Pointers be set. Consequently, unmanaged code cannot participate in stack unwinding, as it will never be notified that a stack unwind is occurring.

Think about that for a minute. If alarms are not sounding in your head, you're in deep, deep trouble. Consider this unmanaged C code:

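 #include <stdlib.h>   /* malloc, free */
 #include <string.h>   /* strcpy */
 
 typedef void (*Handler) (const char *message);
 
 void InvokeHandler (Handler handler)
 {
   char *message = (char *) malloc (10);
   strcpy (message, "A Message");
   (*handler)(message);   /* may never return if the delegate throws */
   free (message);        /* skipped when the managed unwind passes through */
 }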

If handler is a pointer to a managed delegate which may throw an exception, then free(3) will not be executed, resulting in a memory leak. C++ destructors won't help you either, as destructors still require the execution of some code, and that code will never be invoked, as it's not C++ which is unwinding the stack, but managed code, which doesn't know about C++ exception handling.
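The effect can be reproduced in plain C as an analogy. In the sketch below, longjmp(3) stands in for the managed stack unwind: like a managed exception, it leaves InvokeHandler without running the code that follows the callback. This is only an illustration of the skipped cleanup, not a description of what the runtime does internally; unwind_target, live_allocations, and leaked_after are hypothetical names.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>
#include <string.h>

typedef void (*Handler) (const char *message);

static jmp_buf unwind_target;      /* stands in for a managed catch block */
static int live_allocations = 0;   /* crude leak counter for this sketch */

void InvokeHandler (Handler handler)
{
  char *message = malloc (10);
  assert (message != NULL);
  live_allocations++;
  strcpy (message, "A Message");
  (*handler) (message);
  free (message);                  /* never reached if the handler unwinds */
  live_allocations--;
}

static void returning_handler (const char *message)
{
  (void) message;                  /* returns normally */
}

static void unwinding_handler (const char *message)
{
  (void) message;
  longjmp (unwind_target, 1);      /* leaves InvokeHandler without cleanup */
}

/* Returns the number of blocks still allocated after one call. */
int leaked_after (int unwind)
{
  live_allocations = 0;
  if (setjmp (unwind_target) == 0)
    InvokeHandler (unwind ? unwinding_handler : returning_handler);
  return live_allocations;
}
```

With a handler that returns, the count goes back to zero; with a handler that unwinds, one block stays allocated, which is the leak described above.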

Obviously, the flip-side of this scenario -- a C++ exception being propagated into managed code -- is equally bad. As long as managed and unmanaged code use different exception handling mechanisms, exceptions must not be mixed between them.

The moral of this story: don't let exceptions propagate between managed and unmanaged code. The results won't be pretty.

This is particularly pertinent when wrapping C++ methods. C++ exceptions will need to be mapped into an "out" parameter or a return value, so that managed code can know what error occurred, and (optionally) throw a managed exception to "propagate" the original C++ exception.
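Seen from managed code, a wrapper that follows this advice has a plain C signature: a status code as the return value and the real result in an "out" parameter. The sketch below shows only that C-facing shape, using a small parsing function as a stand-in; in an actual C++ wrapper the body would sit inside a try/catch that produces the status code. The names wrap_status and wrap_parse_int are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical status codes that managed code could turn into exceptions. */
enum wrap_status { WRAP_OK = 0, WRAP_BAD_ARGUMENT = 1, WRAP_OUT_OF_RANGE = 2 };

/* Parses a decimal integer. Failure is reported through the return value;
   the parsed number is delivered through the `out` parameter. */
int wrap_parse_int (const char *text, int *out)
{
  if (text == NULL || out == NULL)
    return WRAP_BAD_ARGUMENT;

  char *end = NULL;
  errno = 0;
  long value = strtol (text, &end, 10);
  assert (end != NULL);            /* strtol always sets a non-NULL endptr */

  if (end == text || *end != '\0')
    return WRAP_BAD_ARGUMENT;      /* empty input or trailing garbage */
  if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
    return WRAP_OUT_OF_RANGE;

  *out = (int) value;
  return WRAP_OK;
}
```

The managed caller can then check the status and, if it wants, throw a managed exception of its own, so nothing ever unwinds across the boundary.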

Marshaling

How does Platform Invoke work? Given a managed call site (the function call), and an unmanaged callee site (the function that's being called), each parameter in the call site is "marshaled" (converted) into an unmanaged equivalent. The marshaled data is in turn placed on the runtime stack (along with other data), and the unmanaged function is invoked.

The complexity is due to the marshaling. For simple types, such as integers and floating-point numbers, marshaling is a bitwise-copy ("blitting"), just as would be the case for unmanaged code. In some cases, marshaling can be avoided, such as when passing structures by reference to unmanaged code (a pointer to the structure is copied instead). It's also possible to obtain more control over marshaling, through custom marshaling and manual marshaling.

String types introduce additional complexity, as you need to specify the form of string conversion. The runtime stores strings as UTF-16-encoded strings, and these will likely need to be marshaled to a more appropriate form (ANSI strings, UTF-8 encoded strings, etc.). Strings get some special support.

Default marshaling behavior is controlled through the DllImport and MarshalAs attributes.

Memory Boundaries

Managed and unmanaged memory should be considered to be completely separate. Managed memory is typically memory allocated on a garbage-collected heap, while unmanaged memory is anything else: the ANSI C memory pool allocated through malloc(3), custom memory pools, and garbage-collected heaps outside the control of the CLI implementation (such as a LISP or Scheme memory heap).

It is possible to lock a section of the managed heap by using the C# fixed statement. This is used so that a section of the managed heap can be passed to unmanaged code without worrying that a future GC will move the memory that the unmanaged code is operating on. However, this is completely under the control of the programmer, and is not how Platform Invoke works.

During a P/Invoke call the runtime doesn't mimic the C# fixed statement. Instead, classes and structures (everything of consequence) are marshaled to native code through the following pseudo-process:

1. The runtime allocates a chunk of unmanaged memory.
2. The managed class data is copied into the unmanaged memory.
3. The unmanaged function is invoked, passing it the unmanaged memory information instead of the managed memory information. This must be done so that if a GC occurs, the unmanaged function doesn't need to worry about it. (And yes, you need to worry about GCs, as the unmanaged function could call back into the runtime, ultimately leading to a GC. Multi-threaded code can also cause a GC while unmanaged code is executing.)
4. The unmanaged memory is copied back into managed memory.
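The four steps can be imitated in plain C to make the copy semantics concrete. This is a conceptual sketch only: the real runtime also converts field types and chooses its own allocator, and the names Data, marshal_and_call, and double_fields are hypothetical. The copy_back flag models whether the last step happens.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Data { int a, b; };

typedef void (*native_fn) (struct Data *);

/* Example "unmanaged function": doubles both fields in place. */
void double_fields (struct Data *d)
{
  d->a *= 2;
  d->b *= 2;
}

/* Imitates the marshaling steps for one call. Returns 0 on success. */
int marshal_and_call (struct Data *managed, native_fn fn, int copy_back)
{
  assert (managed != NULL && fn != NULL);

  struct Data *temp = malloc (sizeof *temp);      /* allocate unmanaged memory */
  if (temp == NULL)
    return -1;
  memcpy (temp, managed, sizeof *temp);           /* copy the data in */
  fn (temp);                                      /* invoke on the copy */
  if (copy_back)
    memcpy (managed, temp, sizeof *temp);         /* copy the data back */
  free (temp);                                    /* the copy dies with the call */
  return 0;
}
```

Without the copy back, the caller's structure is unchanged no matter what the native function wrote, and in both cases the temporary block is gone once the call returns.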

See Class and Structure Marshaling for more detailed information about marshaling classes and structures.

There is one key point to keep in mind: the memory management specified in the above process is implicit, and there is no way to control how the runtime allocates the marshaled memory, or how long it lasts. This is crucial. If the runtime marshals a string (e.g. UTF-16 to Ansi conversion), the marshaled string will only last as long as the call. The unmanaged code CANNOT keep a reference to this memory, as it WILL be freed after the call ends. Failure to heed this restriction can result in "strange behavior", including memory access violations and process death. This is true for any marshaling process where the runtime allocates memory for the marshal process.
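On the unmanaged side, the practical consequence is that a function which needs a string after the call has returned must take its own copy. A minimal sketch in C, with hypothetical names set_name and get_name:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static char *saved_name = NULL;    /* owned by this library, not the caller */

/* Stores a private copy of `name`. Returns 0 on success, -1 on failure. */
int set_name (const char *name)
{
  assert (name != NULL);

  size_t size = strlen (name) + 1;
  char *copy = malloc (size);
  if (copy == NULL)
    return -1;
  memcpy (copy, name, size);       /* copy now; `name` may vanish after return */

  free (saved_name);               /* release the previous copy, if any */
  saved_name = copy;
  return 0;
}

/* Returns the stored copy, or NULL if nothing has been stored yet. */
const char *get_name (void)
{
  return saved_name;
}
```

Storing the incoming pointer itself would leave saved_name dangling as soon as the runtime releases the marshaled buffer.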

The one pseudo-exception to this point is with delegates. The unmanaged function pointer that represents the managed delegate lasts as long as the managed delegate does. When the delegate is collected by the GC, the unmanaged function pointer will also be collected. This is also important: if the delegate is collected and unmanaged code then invokes the function pointer, you're on thin ice. Anything could happen, including a process seg-fault. Consequently, you MUST ensure that the lifetime of the unmanaged function pointer is a proper subset of the lifetime of the managed delegate instance.

Blittable Types

Many types require minimal copying into native memory. Blittable types are types that conceptually only require a memcpy(3) or can be passed on the run-time stack without translation. These types include:

C# Type	C Type	<stdint.h> Type	GLib Type
`sbyte`	`char`	`int8_t`	`gint8`
`byte`	`unsigned char`	`uint8_t`	`guint8`
`short`	`short`	`int16_t`	`gint16`
`ushort`	`unsigned short`	`uint16_t`	`guint16`
`int`	`int`; `long` (32-bit platforms only)	`int32_t`	`gint32`
`uint`	`unsigned int`; `unsigned long` (32-bit platforms only)	`uint32_t`	`guint32`
`long`	`long` (64-bit platforms only); `__int64` (MSVC); `long long` (GCC)	`int64_t`	`gint64`
`ulong`	`unsigned long` (64-bit platforms only); `unsigned __int64` (MSVC); `unsigned long long` (GCC)	`uint64_t`	`guint64`
`char`	`unsigned short`	`uint16_t`	`guint16`
`float`	`float`		`gfloat`
`double`	`double`		`gdouble`
`bool`	Depends on context
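The size assumptions behind the table can be checked from C. The sketch below assumes a C11 compiler and 8-bit bytes; the two helper functions are hypothetical names. The fixed-width types are the safe choice for native signatures because the width of plain long depends on the platform's data model (for example, it stays 32 bits on 64-bit Windows).

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* The <stdint.h> types have the same width wherever they exist. */
static_assert (CHAR_BIT == 8, "this sketch assumes 8-bit bytes");
static_assert (sizeof (int8_t) == 1 && sizeof (uint8_t) == 1, "sbyte, byte");
static_assert (sizeof (int16_t) == 2 && sizeof (uint16_t) == 2, "short, ushort, char");
static_assert (sizeof (int32_t) == 4 && sizeof (uint32_t) == 4, "int, uint");
static_assert (sizeof (int64_t) == 8 && sizeof (uint64_t) == 8, "long, ulong");

/* Returns 1 if C `long` is as wide as C# `int` (32 bits), else 0. */
int c_long_matches_csharp_int (void)
{
  return sizeof (long) * CHAR_BIT == 32;
}

/* Returns 1 if C `long` is as wide as C# `long` (64 bits), else 0. */
int c_long_matches_csharp_long (void)
{
  return sizeof (long) * CHAR_BIT == 64;
}
```

On mainstream platforms exactly one of the two helpers returns 1, which is why a C long parameter cannot be mapped to a single C# type portably.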

Strings

Strings are special. String marshaling behavior is also highly platform dependent.

String marshaling for a function call can be specified in the function declaration with the DllImport attribute, by setting the CharSet field. The default value for this field is CharSet.Ansi. The CharSet.Auto value implies "magic."

Some background. The Microsoft Win32 API supports two forms of strings: "ANSI" strings, the native character set, such as ASCII, ISO-8859-1, or a Double Byte Character Set such as Shift-JIS; and Unicode strings, originally UCS-2, and now UTF-16. Windows supports these string formats by appending an "A" for Ansi string APIs and a "W" ("wide") for Unicode string APIs.

Consider this Win32 API description:

 [DllImport ("gdi32.dll", CharSet=CharSet.Auto,
      CallingConvention=CallingConvention.StdCall)]
 private static extern bool TextOut (
      System.IntPtr hdc,
      int nXStart,
      int nYStart,
      string lpString,
      int cbString);

When TextOut is called, the "magic" properties of String marshaling become apparent. Due to string marshaling, the runtime doesn't just look for an unmanaged function with the same name as the specified method, as specified in Invoking Unmanaged Code. Other permutations of the function may be searched for, depending on the CLI runtime and the host platform.

There are three functions that may be searched for:

TextOutW for Unicode string marshaling
TextOutA for Ansi string marshaling
TextOut with the platform-default marshaling

For platforms whose default character set is UCS2 or UTF-16 Unicode (all flavors of Windows NT, and Windows XP), the default search path is TextOutW, TextOutA, and TextOut. Unicode marshaling is preferred, as (ideally) the System.String can be passed as-is to the function, as long as the function doesn't modify the string parameter. Windows CE does not look for TextOutA, as it has no Ansi APIs.

For platforms whose default character set is Ansi (Windows 9x, Windows ME), the default search path is TextOutA and TextOut (TextOutW is not looked for). Ansi marshaling will require translating the Unicode string into an 8-bit or DBCS string in the user's locale. Most (all?) of the time, this WILL NOT be UTF-8, so you CAN NOT assume that CharSet.Ansi will generate UTF-8-encoded strings.

Mono on all platforms currently uses UTF-8 encoding for all string marshaling operations.

If you don't want the runtime to search for the alternate unmanaged functions, specify a CharSet value other than CharSet.Auto. This will cause the runtime to look only for the specified function. Note that if you pass a wrongly encoded string (e.g. calling MessageBoxW when the CharSet is CharSet.Ansi, the default), you are crossing into "undefined" territory. The unmanaged function will receive data encoded in ways it wasn't expecting, so you may get such bizarre things as Asian text when displaying "Hello, World".

Perhaps in the future the CharSet enumeration will contain more choices, such as UnicodeLE (little-endian), UnicodeBE (big-endian), Utf7, Utf8, and other common choices. Additionally, making such a change would also likely require changing the UnmanagedType enumeration. However, these would need to go through ECMA, so it won't happen next week. (Unless some time has passed since this was originally written, in which case it may very well be next week. But don't count on it.)

More Control

Using the DllImport attribute works if you want to control all the strings in a function, but what if you need more control? You would need more control if a string is a member of a structure, or if the function uses multiple different types of strings as parameters. In these circumstances, the MarshalAs attribute can be used, setting the Value property (which is set in the constructor) to a value from the UnmanagedType enumeration. For example:

 [DllImport ("does-not-exist")]
 private static extern void Foo (
      [MarshalAs(UnmanagedType.LPStr)] string ansiString,
      [MarshalAs(UnmanagedType.LPWStr)] string unicodeString,
      [MarshalAs(UnmanagedType.LPTStr)] string platformString);

As you can guess by reading the example, UnmanagedType.LPStr will marshal the input string into an Ansi string, UnmanagedType.LPWStr will marshal the input string into a Unicode string (effectively doing nothing), and UnmanagedType.LPTStr will convert the string to the platform's default string encoding.

The default platform encoding for all flavors of Windows NT (including Windows NT 3.51 and 4.0, Windows 2000, Windows XP, Windows Server 2003) is Unicode, while for all Windows 9x flavors (Windows 95, 98, ME) the platform default encoding is Ansi.

Mono uses UTF-8 encoding as the default encoding on all platforms.

There are other UnmanagedType string marshaling options, but they're primarily of interest in COM Interop (BStr, AnsiBStr, TBStr).

If UnmanagedType doesn't provide enough flexibility for your string marshaling needs (for example, you're wrapping GTK+ and you need to marshal strings in UTF-8 format), look at the Custom Marshaling or Manual Marshaling sections.

Passing Caller-Modifiable Strings

A common C language idiom is for the caller to provide the callee a buffer to fill. For example, consider strncpy(3):

 char* strncpy (char *dest, const char *src, size_t n);

We can't use System.String for both parameters, as strings are immutable. This is OK for src, but dest will be modified, and the caller should be able to see the modification.

The solution is to use a System.Text.StringBuilder, which gets special marshaling support from the runtime. This would allow strncpy(3) to be wrapped and used as:

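 using System;
 using System.Runtime.InteropServices;
 using System.Text;
 
 class StrncpyExample
 {
    [DllImport ("libc.so")]
    private static extern void strncpy (StringBuilder dest,
         string src, uint n);
 
    private static void UseStrncpy ()
    {
       // Reserve 256 characters for the native code to fill.
       StringBuilder sb = new StringBuilder (256);
       // Capacity is an int, so cast it to match the uint parameter.
       strncpy (sb, "this is the source string", (uint) sb.Capacity);
       Console.WriteLine (sb.ToString());
    }
 }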

Some things to note: the return value of strncpy(3) was changed to void. There is no way to tell the runtime that the returned pointer is the same address as the input dest buffer, so it is simpler not to marshal it at all. If string were used instead, Bad Things could happen (the returned string would be freed; see Strings as Return Values). The StringBuilder is allocated with the correct amount of storage as a constructor parameter, and this amount of storage is passed to strncpy(3) to prevent buffer overflow. If you use a StringBuilder instance multiple times, always call EnsureCapacity() before passing it into the native method, as the capacity may shrink as a memory optimization over time, leading to unexpectedly truncated results.
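When you control the native side, the buffer-filling idiom is easier to wrap if the function takes the capacity explicitly and always terminates the result. The C sketch below uses a hypothetical function name, fill_buffer. It also guards against a property of strncpy(3) itself: when the source is at least as long as the count, no terminating NUL is written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies `src` into the caller's buffer and always NUL-terminates it.
   Returns the number of characters stored, excluding the terminator. */
size_t fill_buffer (char *dest, size_t capacity, const char *src)
{
  assert (dest != NULL && src != NULL);
  if (capacity == 0)
    return 0;

  strncpy (dest, src, capacity - 1);   /* may leave dest unterminated ... */
  dest[capacity - 1] = '\0';           /* ... so terminate it explicitly */
  return strlen (dest);
}
```

A source longer than the buffer is truncated to capacity - 1 characters instead of overflowing it.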

TODO: How does StringBuilder interact with the specified CharSet?

Strings as Return Values

The String type is a class, so see the section on returning classes from functions. Summary: the runtime will attempt to free the returned pointer. The usual symptom is a runtime crash like this:

=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=================================================================

Stacktrace:

in <0x4> (wrapper managed-to-native) System.Object:__icall_wrapper_g_free (intptr)
in <0x6b9d0c> (wrapper managed-to-native) System.Object:__icall_wrapper_g_free (intptr)

If you don't want the runtime to free the returned string, either (a) don't specify the return value (as was done for the strncpy(3) function above), or (b) return an IntPtr and use one of the Marshal.PtrToString* functions, depending on the type of string returned. For example, use Marshal.PtrToStringAnsi to marshal from an Ansi string, and use Marshal.PtrToStringUni to marshal from a Unicode string.
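Whether freeing is safe depends on who owns the returned pointer, which the following C sketch makes explicit. All three function names are hypothetical. The first function returns static storage that nobody may free; the second returns heap memory and is paired with a release function from the same library, so the allocator always matches.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a pointer to static storage. It must never be freed, which is
   the case where the managed side should declare the return as IntPtr. */
const char *library_version (void)
{
  static const char version[] = "1.2.3";
  return version;
}

/* Returns a newly allocated string, or NULL on allocation failure.
   The caller owns it and releases it with free_greeting(). */
char *make_greeting (const char *name)
{
  assert (name != NULL);

  size_t size = strlen ("Hello, ") + strlen (name) + 1;
  char *text = malloc (size);
  if (text != NULL)
    snprintf (text, size, "Hello, %s", name);
  return text;
}

/* Releases a string obtained from make_greeting(). */
void free_greeting (char *text)
{
  free (text);
}
```

For both functions a managed caller can receive the pointer as IntPtr, convert it with Marshal.PtrToStringAnsi, and then either leave it alone or hand it back to free_greeting.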

Class and Structure Marshaling

The conceptual steps that occur to marshal classes and structures are detailed above, in the Memory Boundaries section.

The main difference between class and structure marshaling is which ones, if any, of the conceptual steps actually occur.

Class Marshaling

Remember that classes are heap-allocated and garbage-collected in the CLI. As such, you cannot pass classes by value to unmanaged functions, only by reference:

 /* Unmanaged code declarations */
 struct UnmanagedStruct {
    int a, b, c;
 };
 
 void WRONG (struct UnmanagedStruct pass_by_value);
 
 void RIGHT (struct UnmanagedStruct *pass_by_reference);
 void RIGHT2 (struct UnmanagedStruct **pass_by_reference_out_or_ref);

This means that you cannot use classes to invoke unmanaged functions that expect pass-by-value variables (such as the WRONG function, above).

There are two other issues with classes. First of all, classes by default use LayoutKind.Auto layout. This means that the ordering of class data members is unknown, and won't be determined until runtime. The runtime can rearrange the order of members in any way it chooses, to optimize for access time or data layout space. As such, you MUST use the StructLayout attribute and specify a LayoutKind value of LayoutKind.Sequential or LayoutKind.Explicit.

Secondly, classes (again, by default) only have in-bound marshaling. That is, Step 4 (copying the unmanaged memory representation back into managed memory) is omitted. If you need the unmanaged memory to be copied back into managed memory, you must adorn the DllImport function declaration argument with an Out attribute. You will also need to use the In attribute if you want copy-in and copy-out behavior. To summarize:

Using [In] is equivalent to not specifying any parameter attributes, and will skip Step 4 (copying unmanaged memory into managed memory).
Using [Out] will skip Step 2 (copying managed memory into unmanaged memory).
Use [In, Out] to both copy managed memory to unmanaged memory before the unmanaged function call, and then copy unmanaged memory back to managed memory after the function call.
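Both points have a visible counterpart in C. The sketch below assumes a C11 compiler; fill_sum is a hypothetical body for a by-reference function in the style of RIGHT above. The static assertions show the fixed, declaration-order layout that LayoutKind.Sequential has to match, and the write through the pointer is the kind of change that only reaches a managed class when the parameter is marked [Out] or [In, Out].

```c
#include <assert.h>
#include <stddef.h>

struct UnmanagedStruct {
  int a, b, c;
};

/* C lays the members out in declaration order; the managed type must agree. */
static_assert (offsetof (struct UnmanagedStruct, a) == 0, "a comes first");
static_assert (offsetof (struct UnmanagedStruct, b) == sizeof (int), "b second");
static_assert (offsetof (struct UnmanagedStruct, c) == 2 * sizeof (int), "c third");

/* Reads a and b, then writes the result into c through the pointer. */
void fill_sum (struct UnmanagedStruct *s)
{
  assert (s != NULL);
  s->c = s->a + s->b;
}
```

If the runtime were free to reorder the managed fields, fill_sum would read and write the wrong members even though the call itself succeeded.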

In some circumstances, the marshaled copy can be omitted. The object will simply be pinned in memory and a pointer to the start of the data passed to the unmanaged function.

TODO: When can this actually occur? If this happened for any class with Sequential layout, you wouldn't need to specify the Out attribute, as the unmanaged code would see the actual object. Is there a specific set of circumstances for when this can occur? This appears to happen with StringBuilder (my tests don't require an [Out] to see changes made to the StringBuilder by unmanaged code), but this is the only example I can think of.

Structure Marshaling

There are two primary differences between classes and structures. First, structures do not need to be allocated on the heap; they can be allocated on the runtime stack. Secondly, they are LayoutKind.Sequential by default, so structure declarations do not need any additional attributes to use them with unmanaged code (assuming that the default sequential layout rules are correct for the unmanaged structure).

These differences permit structures to be passed by-value to unmanaged functions, unlike classes. Additionally, if (a) the structure is located on the stack, and (b) the structure contains only blittable types, then if you pass a structure to an unmanaged function by-reference, the structure will be passed directly to the unmanaged function, without an intermediate unmanaged memory copy. This means that you may not need to specify the Out attribute to see changes made by unmanaged code.

Note that as soon as the structure contains a non-blittable type (such as System.Boolean, System.String, or an array), this optimization is no longer possible and a copy of the structure must be made as part of the marshaling process.

Classes and Structures as Return Values

The differences in allocation behavior between classes and structures also affect how they're handled as return values from functions.

Classes can be used as the return value of a function when the unmanaged function returns a pointer to an unmanaged structure. Classes cannot be used for by-value return types.

Structures can be used when the unmanaged function returns the structure by-value. It is not possible to return structures with "ref" or "out", so if an unmanaged function returns a pointer to a structure, IntPtr must be used for "safe" code, or a pointer to the structure can be used for "unsafe" code. If IntPtr is used as the return type, Marshal.PtrToStructure can be used to convert the unmanaged pointer into a managed structure.

Memory management is also heavily involved.

Memory Management

It's easy to skim over memory management for most of Platform Invoke and marshaling, but for return values the CLI implements some default handling which must be considered.

The CLI runtime assumes that, under certain circumstances, it is responsible for freeing memory allocated by unmanaged code. Return values are one of those circumstances: a return value marks a boundary at which responsibility for freeing the memory passes from the unmanaged code to the runtime.

The CLI assumes that all memory that is passed between the CLI/unmanaged code boundary is allocated via a common memory allocator. The developer does not get a choice in which memory allocator is used. For managed code, the Marshal.AllocCoTaskMem method can be used to allocate memory, Marshal.FreeCoTaskMem is used to free the memory allocated by Marshal.AllocCoTaskMem, and Marshal.ReAllocCoTaskMem is used to resize a memory region originally allocated by Marshal.AllocCoTaskMem.

Since classes are passed by reference, a pointer is returned, and the runtime assumes that it must free this memory to avoid a memory leak. The chain of events is thus:

Managed code invokes unmanaged function that returns a pointer to an unmanaged structure in unmanaged memory.
An instance of the appropriate managed class is instantiated, and the contents of the unmanaged memory are marshaled into the managed class.
The unmanaged memory is freed by the runtime "as if" by invoking Marshal.FreeCoTaskMem().
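The chain of events above can be reproduced by hand. This is only a sketch under stated assumptions: NativePoint and ReturnChainSketch are hypothetical names, and FakeNativeReturn stands in for an unmanaged function that allocates its result with the same allocator that Marshal.FreeCoTaskMem releases.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical managed mirror of: struct { int x; int y; }
[StructLayout(LayoutKind.Sequential)]
class NativePoint
{
    public int X;
    public int Y;
}

static class ReturnChainSketch
{
    // Stand-in for an unmanaged function returning a pointer to a struct.
    static IntPtr FakeNativeReturn()
    {
        IntPtr p = Marshal.AllocCoTaskMem(Marshal.SizeOf(typeof(NativePoint)));
        Marshal.WriteInt32(p, 0, 3);   // x
        Marshal.WriteInt32(p, 4, 4);   // y
        return p;
    }

    public static NativePoint Call()
    {
        // 1. The unmanaged function returns a pointer to unmanaged memory.
        IntPtr p = FakeNativeReturn();

        // 2. A managed instance is created and filled from that memory.
        NativePoint result = new NativePoint();
        Marshal.PtrToStructure(p, result);

        // 3. The unmanaged memory is released, as the runtime would do.
        Marshal.FreeCoTaskMem(p);
        return result;
    }
}
```

If the unmanaged function had allocated its result with a different allocator, the third step would hand the pointer to the wrong deallocator, which is why the allocator assumption matters for class return types.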

How are Marshal.AllocCoTaskMem, Marshal.ReAllocCoTaskMem, and Marshal.FreeCoTaskMem implemented? That's platform-dependent. (So much for portable platform-dependent code.) Under Windows, the COM Task Memory allocator is used (via CoTaskMemAlloc(), CoTaskMemRealloc(), and CoTaskMemFree()). Under Unix, the GLib memory functions g_malloc(), g_realloc(), and g_free() are used. Typically, these correspond to the ANSI C functions malloc(3), realloc(3), and free(3), but this is not necessarily the case as GLib can use different memory allocators; see g_mem_set_vtable() and g_mem_is_system_malloc().

What do you do if you don't want the runtime to free the memory? Don't return a class. Instead, return an IntPtr (the moral equivalent of a C void* pointer), and then use the Marshal class methods to manipulate that pointer, such as Marshal.PtrToStructure, which works for both C# struct types and class types marked [StructLayout(LayoutKind.Sequential)].
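A minimal sketch of the IntPtr approach follows. The names Version2 and BorrowedPointerSketch are hypothetical, and GetVersionPtr stands in for a DllImport declaration whose return type is IntPtr and whose result points at memory the library keeps owning.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical managed mirror of: struct { int major; int minor; }
[StructLayout(LayoutKind.Sequential)]
struct Version2
{
    public int Major;
    public int Minor;
}

static class BorrowedPointerSketch
{
    // Stand-in for memory owned by the native library (for example a
    // static buffer). It is deliberately never freed by this sketch.
    static readonly IntPtr nativeOwned = CreateNativeOwned();

    static IntPtr CreateNativeOwned()
    {
        IntPtr p = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(Version2)));
        Marshal.WriteInt32(p, 0, 1);   // major
        Marshal.WriteInt32(p, 4, 2);   // minor
        return p;
    }

    // Stand-in for: [DllImport ("mylib")] static extern IntPtr GetVersionPtr ();
    static IntPtr GetVersionPtr()
    {
        return nativeOwned;
    }

    public static Version2 Read()
    {
        IntPtr p = GetVersionPtr();
        // Copies the data into a managed struct; p itself is not freed.
        return (Version2) Marshal.PtrToStructure(p, typeof(Version2));
    }
}
```

Because nothing frees the pointer, Read can be called repeatedly and still sees valid memory; releasing it, if that is ever needed, stays the caller's or the library's decision.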

Choosing between Classes and Structures

So which should be used when wrapping unmanaged code, classes or structures?

Generally, the answer to this question depends upon what the unmanaged code requires. If you require pass-by-value semantics, you must use structures. If you want to return a pointer to an unmanaged type without resorting to "unsafe" or manual code, you must use classes (assuming that the default memory allocation rules are appropriate).

For the large intersection of unmanaged code that doesn't have pass-by-value structures or return pointers to structures from functions? Use whichever is more convenient for the end user. Not all languages support passing types by reference (Java, for example), so using classes will permit a larger body of languages to use the wrapper library. Furthermore, Microsoft suggests that structure sizes not exceed 16 bytes.

Summary

It's always easier to show the code, so... Given the following unmanaged code declarations:

 /* unmanaged code declarations */
 
 struct UnmanagedStruct {
    int n;
 };
 
 void PassByValue (struct UnmanagedStruct s);
 
 void PassByReferenceIn (struct UnmanagedStruct *s);
 void PassByReferenceOut (struct UnmanagedStruct *s);
 void PassByReferenceInOut (struct UnmanagedStruct *s);
 
 struct UnmanagedStruct ReturnByValue ();
 struct UnmanagedStruct* ReturnByReference ();
 
 void DoubleIndirection (struct UnmanagedStruct **s);

The class wrapper could be:

 /* note: sequential layout */
 [StructLayout (LayoutKind.Sequential)]
 class ClassWrapper {
    public int n;
 
    /* cannot wrap function PassByValue */
 
    /* PassByReferenceIn */
    [DllImport ("mylib")]
    public static extern
       void PassByReferenceIn (ClassWrapper s);
 
    /* PassByReferenceOut */
    [DllImport ("mylib")]
    public static extern
       void PassByReferenceOut ([Out] ClassWrapper s);
 
    /* PassByReferenceInOut */
    [DllImport ("mylib")]
    public static extern
       void PassByReferenceInOut ([In, Out] ClassWrapper s);
 
    /* cannot wrap function ReturnByValue */
 
    /* ReturnByReference */
    [DllImport ("mylib")]
    public static extern ClassWrapper ReturnByReference ();
       /* note: this causes returned pointer to be freed
          by runtime */
 
    /* DoubleIndirection */
    [DllImport ("mylib")]
    public static extern
       void DoubleIndirection (ref ClassWrapper s);
 }

While the structure wrapper could be:

 struct StructWrapper {
    public int n;
 
    /* PassByValue */
    [DllImport ("mylib")]
    public static extern void PassByValue (StructWrapper s);
 
    /* PassByReferenceIn */
    [DllImport ("mylib")]
    public static extern void PassByReferenceIn (
       ref StructWrapper s);
 
    /* PassByReferenceOut */
    [DllImport ("mylib")]
    public static extern void PassByReferenceOut (
       out StructWrapper s);
 
    /* PassByReferenceInOut */
    [DllImport ("mylib")]
    public static extern void PassByReferenceInOut (
       ref StructWrapper s);
 
    /* ReturnByValue */
    [DllImport ("mylib")]
    public static extern StructWrapper ReturnByValue ();
 
    /* ReturnByReference: CLS-compliant way */
    [DllImport ("mylib", EntryPoint="ReturnByReference")]
    public static extern IntPtr ReturnByReferenceCLS ();
       /* note: this DOES NOT cause returned pointer to be
          freed by the runtime, so it's not identical to
          ClassWrapper.ReturnByReference.
          Use Marshal.PtrToStructure() to access the
          underlying structure. */
 
    /* ReturnByReference: "unsafe" way */
    [DllImport ("mylib", EntryPoint="ReturnByReference")]
    public static unsafe extern StructWrapper*
       ReturnByReferenceUnsafe ();
       /* note: this DOES NOT cause returned pointer to be
          freed by the runtime, so it's not identical to
          ClassWrapper.ReturnByReference */
 
    /* DoubleIndirection: CLS-compliant way */
    [DllImport ("mylib", EntryPoint="DoubleIndirection")]
    public static extern
       void DoubleIndirectionCLS (ref IntPtr s);
       /* note: this is similar to ReturnByReferenceCLS().
          Pass a `ref IntPtr' to the function, then use
          Marshal.PtrToStructure() to access the
          underlying structure. */
 
    /* DoubleIndirection: "unsafe" way */
    [DllImport ("mylib", EntryPoint="DoubleIndirection")]
    public static unsafe extern
       void DoubleIndirectionUnsafe (StructWrapper **s);
 }

Marshaling Class and Structure Members

Aside from the major differences between classes and structures outlined above, the members of classes and structures are marshaled identically.

The general rule of advice is this: never pass classes or structures containing members of reference type (classes) to unmanaged code. This is because unmanaged code can't do anything safely with the managed reference (pointer), and the CLI runtime doesn't do a "deep marshal" (marshal members of marshaled classes, and their members, ad infinitum).

The immediate net effect of this is that you can't have array members in marshaled classes, and (as we've seen before) handling strings can be "wonky" (as strings are also a reference type).

Furthermore, the default string marshaling is the platform default, though this can be changed by setting the StructLayoutAttribute.CharSet field (C# compilers emit CharSet.Ansi when it is not specified). Alternatively, you can adorn string members with the MarshalAs attribute to specify what kind of string they are.

Boolean Members

The System.Boolean (bool in C#) type is special. By default, a bool within a structure, or a bool passed as an argument to a Platform Invoke function, is marshaled as an int (a 4-byte integer), with 0 being false and non-zero being true; see UnmanagedType.Bool. For COM interop the default is instead a short (a 2-byte integer), with 0 being false and -1 being true (as all bits are set); see UnmanagedType.VariantBool.

You can always explicitly specify the marshaling to use by using the MarshalAsAttribute on the boolean member, but there are only three legal UnmanagedType values: UnmanagedType.Bool, UnmanagedType.VariantBool and UnmanagedType.U1. UnmanagedType.U1, the only un-discussed type, is a 1-byte integer where 1 represents true and 0 represents false.
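The effect of the MarshalAs attribute on a boolean member can be checked with Marshal.SizeOf, which reports the unmanaged size of a type. The structure names below are hypothetical and the sketch only covers the default and UnmanagedType.U1 cases.

```csharp
using System;
using System.Runtime.InteropServices;

// Default marshaling: the bool occupies a 4-byte integer (UnmanagedType.Bool).
[StructLayout(LayoutKind.Sequential)]
struct DefaultBoolField
{
    public bool Flag;
}

// Explicit marshaling: the bool occupies a single byte.
[StructLayout(LayoutKind.Sequential)]
struct ByteBoolField
{
    [MarshalAs(UnmanagedType.U1)]
    public bool Flag;
}

static class BoolSizeSketch
{
    // Unmanaged size, in bytes, of each structure.
    public static int DefaultSize()
    {
        return Marshal.SizeOf(typeof(DefaultBoolField));
    }

    public static int ByteSize()
    {
        return Marshal.SizeOf(typeof(ByteBoolField));
    }
}
```

A mismatch between these sizes and the unmanaged declaration shifts every member that follows the boolean, so it is worth checking whenever a C structure contains a flag.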

If you need to marshal as another data type, you should overload the method accepting the boolean parameter, and manually convert the boolean to your desired type:

 // Unmanaged C declaration:
 void DoSomething (int boolean);

 // Managed declaration:
 [DllImport ("SomeLibrary")]
 private static extern void DoSomething (int boolean);
 
 public static void DoSomething (bool boolean)
 {
    DoSomething (boolean ? 1 : 0);
 }

Unions

A C union (in which multiple members share the same offset into a structure) can be simulated by using the FieldOffset attribute and specifying the same offset for the union members.
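As a sketch, a C declaration such as union { int i; float f; } could be mirrored as follows. IntOrFloat is a hypothetical name; note that LayoutKind.Explicit is required before FieldOffset can be used.

```csharp
using System;
using System.Runtime.InteropServices;

// Both members start at offset 0, so they share the same four bytes.
[StructLayout(LayoutKind.Explicit)]
struct IntOrFloat
{
    [FieldOffset(0)] public int   AsInt;
    [FieldOffset(0)] public float AsFloat;
}

static class UnionSketch
{
    // Writes through one member and reads the same bits through the other.
    public static int BitsOf(float value)
    {
        IntOrFloat u = new IntOrFloat();
        u.AsFloat = value;
        return u.AsInt;
    }
}
```

Overlapping members should be restricted to value types; overlaying a reference-type member with anything else is not something this sketch attempts.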

Longs

The C 'long' type is difficult to marshal as a struct member, since there is no CLR type which matches it, i.e. 'int' is 32 bit, 'long' is 64 bit, while C's 'long' can be 32 bit or 64 bit, depending on the platform. There are two possible solutions:

Using two sets of structures, one for 32 bit and one for 64 bit platforms.
Mapping C 'long' to 'IntPtr'. This will work on all 32 bit and 64 bit platforms, except 64 bit Windows, where sizeof(long)==4 and sizeof(void*)==8.
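The second solution might look like the following sketch. NativeLongPair is a hypothetical mirror of a C struct holding two 'long' members; the properties hide the IntPtr fields from callers. The 64 bit Windows caveat above still applies.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical mirror of: struct { long first; long second; }
[StructLayout(LayoutKind.Sequential)]
struct NativeLongPair
{
    // IntPtr is 4 bytes on 32 bit platforms and 8 bytes on 64 bit ones.
    private IntPtr first;
    private IntPtr second;

    public long First
    {
        get { return first.ToInt64(); }
        // new IntPtr(long) throws OverflowException on 32 bit platforms
        // when the value does not fit in 32 bits.
        set { first = new IntPtr(value); }
    }

    public long Second
    {
        get { return second.ToInt64(); }
        set { second = new IntPtr(value); }
    }
}
```

Only the two IntPtr fields take part in marshaling, so the unmanaged size of the structure is twice IntPtr.Size on every platform.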

Arrays Embedded Within Structures

Inline arrays can be marshaled by using a MarshalAs attribute with UnmanagedType.ByValArray and specifying the MarshalAsAttribute.SizeConst field to the size of the array to marshal. Inline arrays which contain strings can use UnmanagedType.ByValTStr for a string.

However, the runtime doesn't automatically allocate arrays specified as UnmanagedType.ByValArray. The programmer is still responsible for allocating the managed array. See the summary for more information.

TODO: Bernie Solomon says that for out parameters, the runtime will allocate the inline array memory. Check this out.

For example, the unmanaged structure:

 struct UnmanagedStruct {
    int data[10];
    char name[32];
 };

Can be represented in C# as:

 struct ManagedStruct_Slow {
    [MarshalAs (UnmanagedType.ByValArray, SizeConst=10)]
    public int[]  data;
    [MarshalAs (UnmanagedType.ByValTStr, SizeConst=32)]
    public string name;
 }
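Since the runtime does not allocate the managed array, code using such a structure has to do so before marshaling it. The sketch below uses a hypothetical InlineArrays structure with the same shape as ManagedStruct_Slow, sets CharSet.Ansi explicitly so that name occupies 32 bytes, and uses Marshal.StructureToPtr and Marshal.PtrToStructure in place of a real native call.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
struct InlineArrays
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 10)]
    public int[] data;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
    public string name;
}

static class InlineArraySketch
{
    public static InlineArrays RoundTrip(string name)
    {
        InlineArrays s = new InlineArrays();
        s.data = new int[10];              // must be allocated by the caller
        for (int i = 0; i < s.data.Length; i++)
            s.data[i] = i * i;
        s.name = name;                     // at most 31 characters plus the terminator

        IntPtr mem = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(InlineArrays)));
        try
        {
            // Managed -> unmanaged, then unmanaged -> a fresh managed copy.
            Marshal.StructureToPtr(s, mem, false);
            return (InlineArrays) Marshal.PtrToStructure(mem, typeof(InlineArrays));
        }
        finally
        {
            Marshal.FreeHGlobal(mem);
        }
    }
}
```

The unmanaged size reported by Marshal.SizeOf for this structure is 72 bytes: 40 for the ten ints plus 32 for the name.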

Of course, the managed structure can be declared in other ways, with varying performance and usage tradeoffs. The ManagedStruct_Slow declaration is the most straightforward to use, but has the worst performance characteristics. The following structure will marshal faster, but will be more difficult to work with:

 struct ManagedStruct_Fast_1 {
    public int  data_0, data_1, data_2, data_3, data_4,
                data_5, data_6, data_7, data_8, data_9;
    public byte name_00, name_01, name_02, name_03, name_04,
                name_05, name_06, name_07, name_08, name_09,
                name_10, name_11, name_12, name_13, name_14,
                name_15, name_16, name_17, name_18, name_19,
                name_20, name_21, name_22, name_23, name_24,
                name_25, name_26, name_27, name_28, name_29,
                name_30, name_31;
 }

Yet another alternative is to directly specify the size of the structure, instead of letting the structure contents dictate the structure size. This is done via the StructLayoutAttribute.Size field. This makes the structure terribly annoying to deal with, as pointer arithmetic must be used to deal with the name member:

 [StructLayout(LayoutKind.Sequential, Size=72)]
 struct ManagedStruct_Fast_2 {
    public int  data_0, data_1, data_2, data_3, data_4,
                data_5, data_6, data_7, data_8, data_9;
    public byte name; /* first byte of name */
    /* Size property specifies that 31 bytes of nameless
       "space" is placed here. */
 }
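The pointer arithmetic can be done without "unsafe" code by combining Marshal.OffsetOf with IntPtr.Add. The following is a sketch only: PaddedStruct is a hypothetical copy of the structure above, and Demo fills a block of unmanaged memory by hand instead of receiving it from a native function.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

[StructLayout(LayoutKind.Sequential, Size = 72)]
struct PaddedStruct
{
    public int  data_0, data_1, data_2, data_3, data_4,
                data_5, data_6, data_7, data_8, data_9;
    public byte name; /* first byte of name */
}

static class PaddedStructSketch
{
    // Byte offset of the name member inside the unmanaged layout.
    public static int NameOffset()
    {
        return Marshal.OffsetOf(typeof(PaddedStruct), "name").ToInt32();
    }

    // Reads the NUL-terminated name from an unmanaged PaddedStruct.
    public static string ReadName(IntPtr unmanagedStruct)
    {
        return Marshal.PtrToStringAnsi(IntPtr.Add(unmanagedStruct, NameOffset()));
    }

    public static string Demo(string name)
    {
        int size = Marshal.SizeOf(typeof(PaddedStruct));
        IntPtr mem = Marshal.AllocHGlobal(size);
        try
        {
            for (int i = 0; i < size; i++)
                Marshal.WriteByte(mem, i, 0);          // zero the whole block

            // Stand-in for native code filling in the name (ASCII only,
            // at most 31 bytes so the terminator survives).
            byte[] ascii = Encoding.ASCII.GetBytes(name);
            Marshal.Copy(ascii, 0, IntPtr.Add(mem, NameOffset()),
                         Math.Min(ascii.Length, 31));
            return ReadName(mem);
        }
        finally
        {
            Marshal.FreeHGlobal(mem);
        }
    }
}
```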

C# 2.0 Functionality

C# 2.0 adds language features to deal with inline arrays, using a fixed array syntax. This allows the previous structure to be declared as:

 unsafe struct ManagedStruct_v2 {
    public fixed int  data[10];
    public fixed byte name[32];
 }

Fixed array syntax is still "unsafe", and requires elevated privilege to execute.

Real World Experience

This might be of use. From David Jeske (http://www.chat.net/~jeske/):

Th