Html2pdf_chromium.dll fails in Azure with non-descriptive error message

Product: PDFTron.NETCore.Windows.x64 and HTML2PDF

Product Version: 9.3.1

Please give a brief summary of your issue: html2pdf_chromium.dll refuses to run in my Azure Web App (Windows, PremiumV2 mode). The error message in the logs says the operation completed successfully, which is obviously false.

Please describe your issue and provide steps to reproduce it:
I’m developing a web API that uses the HTML2PDF add-on for PDFTron to convert certain HTML strings to PDF. The conversion works when I run my web app locally, but when I deploy it to Azure as an App Service, the conversion always fails, and the error message I get isn’t helpful.

Code snippet:

protected PdfDocument GetHtmlConvertedPortionOfDoc()
	{
	List<string> htmlStrings = new List<string>()
		{
		"<html><head></head><body>Hello World!</body></html>"
		}/*GetHtmlToRender();*/;
	Log.Information("Found {NumHtmlStrings} strings to render", htmlStrings?.Count ?? 0);
	if (htmlStrings == null || htmlStrings.Count == 0) { return null; }

	lock (PdfLock)
		{
		if (!initializedRenderer)
			{			
			// The PDFNet module itself is already initialized elsewhere.
			// So we just need to initialize the HTML2PDF add-on.
			HTML2PDF.SetModulePath("HTML2PDF");
			if (!HTML2PDF.IsModuleAvailable())
				{
				throw new Exception("HTML2PDF was not initialized correctly.");
				}

			initializedRenderer = true;
			}
		}

	var combinedDoc = new PDFDoc();
	Stopwatch watch = new Stopwatch();
	for (int i = 0; i < htmlStrings.Count; i++)
		{
		Log.Information("HTML string length is {HtmlLength}", htmlStrings[i]?.Length ?? 0);
		watch.Restart();
		using (var doc = new PDFDoc())
		using (var converter = new HTML2PDF())
			{
			// The header is 3 lines tall on the first page and 2 lines tall on other pages.
			// So allow space for the 2-line header if a header is in use, and CSS will add the additional space
			// taken by the 3rd header line on the first page if needed.
			double topMargin = VerticalMarginPt;
			if (Options.ShowHeader)
				{
				topMargin += LineSpacing + Font.Size * 2;
				}

			// The footer is 1 line (a page number) so add space for it if needed.
			double bottomMargin = VerticalMarginPt;
			if (Options.ShowHeader)
				{
				bottomMargin += Font.Size;
				}

			// All four margin values are in points, so use the same unit for every side.
			converter.SetMargins($"{topMargin}pt", $"{bottomMargin + 10}pt", $"{HorizontalMarginPt}pt", $"{HorizontalMarginPt}pt");
			var settings = new HTML2PDF.WebPageSettings();
			settings.SetPrintBackground(true);
			settings.SetJavaScriptDelay(2000);

			converter.InsertFromHtmlString(htmlStrings[i], settings);
			if (!converter.Convert(doc))
				{
				Log.Warning("PDF conversion failed.");
				if (!EnvironmentHelper.IsProd())
					{
					try
						{
						string path = $"HTML2PDF/MyLog.txt";
						File.WriteAllText(path, converter.GetLog());
						}
					catch (Exception ex)
						{
						Log.Warning(ex, "Failed to write PDF conversion log");
						}
					}
				}

			combinedDoc.InsertPages(combinedDoc.GetPageCount() + 1, doc, 1, doc.GetPageCount(), PDFDoc.InsertFlag.e_none);
			watch.Stop();
			Log.Information($"Conversion of PDF section {i} took {watch.ElapsedMilliseconds} ms.");
			}
		}

	Log.Information("Total pages written to doc: {DocPages}", combinedDoc.GetPageCount());
	using (var ms = new MemoryStream(combinedDoc.Save(SaveOptions.e_compatibility)))
		{
		return PdfReader.Open(ms, PdfDocumentOpenMode.Import);
		}
	}

The error:
Even though HTML2PDF.IsModuleAvailable() returns true, the Convert() method always returns false. Checking the log file where I wrote the contents of converter.GetLog(), I see the following:

[0930/152404.660:WARNING:resource_bundle.cc(448)] locale_file_path.empty() for locale 
[0930/152404.691:FATAL:win_util.cc(814)] Check failed: false. : The operation completed successfully. (0x0)
Backtrace:
	CrashForExceptionInNonABICompliantCodeRange [0x00007FF71E9E1402+9602450]
	CrashForExceptionInNonABICompliantCodeRange [0x00007FF71E921DC2+8818514]
	CrashForExceptionInNonABICompliantCodeRange [0x00007FF71E939A95+8916005]
	CrashForExceptionInNonABICompliantCodeRange [0x00007FF71E93A8EC+8919676]
	CrashForExceptionInNonABICompliantCodeRange [0x00007FF71E93AC60+8920560]
	CrashForExceptionInNonABICompliantCodeRange [0x00007FF71EA097D9+9767273]
	GetHandleVerifier [0x00007FF7204BD612+17125762]
	CrashForExceptionInNonABICompliantCodeRange [0x00007FF71F1D4F22+17940658]
	Ordinal0 [0x00007FF71D38078F+31524751]
	Ordinal0 [0x00007FF71D37F1E6+31519206]
	Ordinal0 [0x00007FF71D37F67C+31520380]
	CrashForExceptionInNonABICompliantCodeRange [0x00007FF71E90C26F+8729599]
	CrashForExceptionInNonABICompliantCodeRange [0x00007FF71E90C0F0+8729216]
	CrashForExceptionInNonABICompliantCodeRange [0x00007FF71E90BE23+8728499]
	Ordinal0 [0x00007FF71B571032+4146]
	GetHandleVerifier [0x00007FF723735038+70044584]
	BaseThreadInitThunk [0x00007FFA710984D4+20]
	RtlUserThreadStart [0x00007FFA71941791+33]

Exit code: 0X80000003
Missing output file D:/local/Temp/pdftron/Trn-3112-1664551038-fe82bad4-83d9-4d38-b85b-df837fb28395

Additionally, if I use the Azure console to directly run html2pdf_chromium.dll, I get the same output as in the log file, except that the last 3 lines are missing. However, if I directly run html2pdf_chromium.dll on my local machine using the command prompt, I see different output:

[0930/113517.001:WARNING:resource_bundle.cc(448)] locale_file_path.empty() for locale
[0930/113517.020:INFO:content_main_runner_impl.cc(1172)] Chrome is running in full browser mode.

Some research I’ve done indicates that the Azure Web App environment blocks access to certain GDI functions (see the “Azure Web App sandbox” page of the projectkudu/kudu wiki on GitHub). Do you know if html2pdf_chromium.dll uses any of those functions? Have you ever tested it in an Azure Web App environment and gotten it to work? Alternatively, do you know if html2pdf_chromium.dll requires writing to the registry, or does so if the PDFNet license is a trial license (which it is in my case)?

Hi Andrew,

The HTML2PDF module currently does not work properly on Windows with Azure. We are looking into the issue.

In the meantime, you can run the module in a Linux container. We are currently writing a guide on how to do this. I will update you with the link when it is available.

Hi, kmirsalehi.

An update: since you said that HTML2PDF works in Linux, we’ve tried creating an HTTP-triggered Azure function running on Linux, using the Linux version of the HTML2PDF module downloaded from the PDFTron documentation site.

Unfortunately, this has the same problem. We’re able to initialize PDFNet and it does find the HTML2PDF module, but Convert() returns false. In this case, converter.GetLog() does not return anything at all.

Do you have any ideas what could be going wrong?

Here’s our code:

public static class GeneratePdfFromForm
    {
        [FunctionName("GeneratePdfFromForm")]
        public static async Task<IActionResult> Run(
            [HttpTrigger(AuthorizationLevel.Anonymous, "get", "post", Route = null)] HttpRequest req, ExecutionContext context,
            ILogger log)
            {
            string logString = "";
            try
                {
                logString += ($"Current directory: {context.FunctionAppDirectory}" + Environment.NewLine);
                logString += ($"Bin contains: {string.Join(", ", Directory.EnumerateFileSystemEntries(Path.Combine(context.FunctionAppDirectory, "bin")))}" + Environment.NewLine);
                logString += ($"Current directory contents: {string.Join(", ", Directory.EnumerateFileSystemEntries(context.FunctionAppDirectory))}" + Environment.NewLine);

                PDFNet.Initialize("demo:1661197244915:7a08b7010300000000c9c111eee06c79cb17367c849b3ef077e448cb3e");
                HTML2PDF.SetModulePath(Path.Combine(context.FunctionAppDirectory, "bin/HTML2PDF"));
                if (!HTML2PDF.IsModuleAvailable())
                    {
                    // Report the failure in the response body (a throw placed after this return would be unreachable).
                    return new OkObjectResult("Error: HTML2PDF was not initialized correctly." + logString);
                    }

                string html = "<html><head></head><body>Hello World!</body></html>";
                byte[] result = null;
                using (var doc = new PDFDoc())
                using (var converter = new HTML2PDF())
                    {
                    var settings = new HTML2PDF.WebPageSettings();
                    settings.SetPrintBackground(true);
                    converter.InsertFromHtmlString(html, settings);
                    if(!converter.Convert(doc))
                        {
                        return new OkObjectResult($"Error: {converter.GetLog()}" + logString);
                        //throw new Exception("Failed to convert");
                        }

                    using (var ms = new MemoryStream())
                        {
                        doc.Save(ms, pdftron.SDF.SDFDoc.SaveOptions.e_compatibility);
                        result = ms.ToArray();
                        }
                    }

                return new OkObjectResult(result);
                }
            catch(Exception ex)
                {
                return new OkObjectResult($"Exception: {ex}, stack trace: {ex.StackTrace}" + logString);
                }
        }
    }

It’s important to verify the required dependencies are installed by checking the shared object dependencies with ldd. If any are listed as “not found” they must be installed. Please run “ldd” on your instance.
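A minimal shell sketch of that check, for example from an SSH session on the instance. The module path is an assumption (set MODULE to wherever your deployment places html2pdf_chromium.so), and filter_missing and check_deps are illustrative helper names, not part of any PDFTron tooling.

```shell
#!/bin/sh
# Keep only the ldd output lines for dependencies the loader could not resolve.
filter_missing() {
    grep 'not found'
}

# Report unresolved shared-library dependencies of the file given as $1.
# Returns 0 if everything resolves, 1 if something is missing, 2 if ldd is unavailable.
check_deps() {
    if ! command -v ldd >/dev/null 2>&1; then
        echo "ldd is not available on this system."
        return 2
    fi
    missing=$(ldd "$1" 2>/dev/null | filter_missing || true)
    if [ -n "$missing" ]; then
        echo "Unresolved dependencies of $1:"
        echo "$missing"
        return 1
    fi
    echo "All dependencies of $1 resolved."
}

# Assumed location; override by exporting MODULE before running the script.
MODULE="${MODULE:-/home/site/wwwroot/bin/html2pdf_chromium.so}"
if [ -f "$MODULE" ]; then
    check_deps "$MODULE" || true
fi
```

Any library reported as unresolved has to be installed through the image's package manager before the module can start.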

When copied to the host, usually to “/home/site/wwwroot/bin/html2pdf_chromium.so”, the HTML2PDF module will likely not have permission to run, and you will need to grant it execute permission. In a .NET application you can use the “Mono.Posix.NETStandard” package to set file and folder permissions.
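For comparison, the same permission change can be sketched in shell, which may be convenient in a console session or a container build step. This is not the Mono.Posix route described above, only the command-line equivalent of granting the owner execute permission. The default path is the one quoted above and may differ in your deployment; ensure_executable is an illustrative helper name.

```shell
#!/bin/sh
# Add the owner-execute bit to the file given as $1 if it is not already executable.
ensure_executable() {
    if [ ! -x "$1" ]; then
        chmod u+x "$1"
    fi
}

# Path as quoted in the post; override by exporting MODULE before running the script.
MODULE="${MODULE:-/home/site/wwwroot/bin/html2pdf_chromium.so}"
if [ -f "$MODULE" ]; then
    ensure_executable "$MODULE"
    # Show the resulting permission bits for confirmation.
    ls -l "$MODULE"
fi
```

Running this before the first conversion, or once at deployment time, should have the same effect as setting the permission from application code.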

Thank you! The suggestion of using Mono.Posix to set the file permission of the html2pdf_chromium.so file before running it made the difference. I was able to successfully convert an HTML string to PDF!

Hi, unfortunately there is still a problem. The success I had only happens if I run HTML2PDF in a Linux Azure function that I created manually from Visual Studio Code’s Azure extension.

If I create a function through the Azure portal, or through command line tools/Octopus Deploy, then the resulting function app’s Linux OS is missing 3 of the needed libraries:

libnss3.so
libnssutil3.so
libnspr4.so

After some effort I was able to figure out how to build Docker containers and use them with Azure functions. I am able to define a Docker image that contains my function and automatically uses apt-get to add the missing libraries to the Linux OS, and I am able to push that image to a repository and include it in the function. However, HTML2PDF still refuses to work.

I am ensuring that the html2pdf_chromium.so file has the UserExecute file permission enabled. I am getting a true response from HTML2PDF.IsModuleAvailable(). I am having the function run ldd on both html2pdf_chromium.so and libPDFNetC.so immediately before using them; neither of them has any “not found” dependencies. However, converter.Convert() still returns false and converter.GetLog() doesn’t return anything.

My Dockerfile looks like the following:

FROM mcr.microsoft.com/azure-functions/dotnet:4
ENV AzureWebJobsScriptRoot=/home/site/wwwroot \
    AzureFunctionsJobHost__Logging__Console__IsEnabled=true

COPY ["./bin/Release/net6.0/publish", "/home/site/wwwroot"]
# libnssutil3 does not need a separate install because it is part of libnss3.
RUN apt-get -y update \
    && apt-get -y install libnss3 libnspr4

Note that our build process runs dotnet publish to output the solution to the “./bin/Release/net6.0/publish” folder, and the Dockerfile then copies the built function into the container as shown above.

Do you have any other ideas what could be stopping HTML2PDF from working? Also, do you by any chance have a known working minimal sample of a Linux function Docker image where HTML2PDF is used? If so, I could try building off of there to get this working as well as compare it with my image to see what’s causing the problem.

The only other thing I can think of is trying to run the function as “isolated process” instead of “in-process” just in case it helps, but it will require some changes to my startup code.

Hi Andrew,

Apologies for the late response. The only way to get HTML2PDF to work on Azure on Windows is to run it through the command line with the --no-sandbox argument.

There is currently no way to make it work inside the App Service itself.

Hi. I just wanted to update this for the benefit of others with the same problem.

I found a way to run HTML2PDF in Azure! The key was to create an App Service Web API resource that uses a Windows Docker container specifically, and to use a full Windows base image (not nanoserver or servercore). This is what my Dockerfile looks like:

FROM mcr.microsoft.com/windows:1809
	
COPY ./bin/Release/net7.0/publish /app
WORKDIR /app

# Expose port 80
# This is important in order for the Azure App Service to pick up the app
ENV ASPNETCORE_URLS http://+:80
ENV PORT 80
EXPOSE 80

ENTRYPOINT YourProjectNameHere.exe

Note: This assumes that the C# project that calls HTML2PDF targets .NET 7 and is published to the relative directory named in the COPY instruction. The Dockerfile should be in the same folder as the .csproj file, and the HTML2PDF module should be included among the files that are output when the project is published.