Purpose

The purpose of this document is to provide steps and suggestions in order to troubleshoot iOS enrollment or MDM connection failures using Microsoft Sysinternals DebugView application.

Prerequisites

DebugView

In order to troubleshoot using DebugView you need to first download the application.  You can do that here:  https://technet.microsoft.com/en-us/library/bb896647.aspx. This application is simply run directly from the downloaded ZIP file or from the folder into which the ZIP has been extracted.  No installation is necessary.

Configuration Files

Next, you’ll need to enable the debug options for the Afaria Enrollment Server.  This is done in three potential places.  The first place is in the “Web.config” file located in the “aips” directory in the Enrollment Server’s installation path.  By default this is “C:\Program Files\AIPS”; however, this location can be changed during installation.  Additionally, any server that was upgraded from a previous version with a different default path might have a folder named “iPhoneServer” instead of “AIPS”.


From the “aips” folder, edit the “Web.config” file in any plain text editor (e.g. – Notepad).  Locate a line with the key “LogDebugInfo” in the “<appSettings>” section.  This is what it looks like by default:


<appSettings>

<!– <add key =”StaticFilePath” value =”C:\tmp\aips”/> –>
<!– <add key=”LogDebugInfo” value=”yes” /> –>

</appSettings>

The character strings “<!–“  and “–>” before and after any amount of text will cause the parser to treat that text as a comment.  To enable debug, we need the parser to actually read this key as code. To do that we will remove the comment strings out.  The result looks like this:


<appSettings>

<!– <add key =”StaticFilePath” value =”C:\tmp\aips”/> –>
<add key=”LogDebugInfo” value=”yes” />

</appSettings>


Note 1: The “aips2” application directory in IIS points to the same physical path.  So this modification affects requests to “aipService.svc/aips” and “aipService.svc/aips2”

Note 2: The “StaticFilePath” key is used for another purpose that’s outside the scope of this document.

The next file is the “Web.config” located in the “aipsr” folder.  This folder won’t exist for earlier versions of Afaria 7.0 or any 6.6 version.  You can skip this file for those versions.  This file by default has no entry at all for the “LogDebugInfo” key.  The “<appSettings>” section looks like this:

<appSettings>

<add key=”ValidationSettings:UnobtrusiveValidationMode” value=”None”/>

</appSettings>

We need to add the key so it looks identical to the one in the “aips” folder.  The result is this:

<appSettings>
<add key=”ValidationSettings:UnobtrusiveValidationMode” value=”None”/>
  <add key=”LogDebugInfo” value=”yes” />
</appSettings>

Lastly, we need to edit the “AfariaIphoneServer.exe.config” file located in the “Bin” folder.  In this case, the “LogDebugInfo” key does exist but instead of being commented out, it’s simply set to a value of “no”.  You need to just update it to a value of “yes”, matching the other files edited so far.

Once all three (or two) files are updated, open the Windows Services control panel (Start > Administrative Tools > Services) and restart the “AfariaIPhoneServer” service.

Configuring DebugView

Once debug is enabled you will need to open DebugView (“Dbgview.exe”) from the folder or ZIP file where you downloaded it.  Be sure to right-click on Dbgview.exe and choose “Run as Administrator” to ensure the process has permissions to capture debug output.  This is the interface for DebugView:

DebugView_1.png

The default options are lacking one critical selection. From the “Capture” menu, select “Capture Global Win32” to add the selection to the default options.

DebugView_2.png

You may also wish to enable one or two other options. From the “Options” menu, the “Clock Time” option will change the display of the “Time” column from showing the number of seconds that have elapsed since DebugView was opened to showing the local clock time.

The other option is the “History Depth” option from the “Options” menu.  This brings up an interface that looks like this:

DebugView_3.png

This option allows DebugView to hold a maximum number of rows at any time.  Once the maximum is reached, the oldest records will be purged as new ones are added. This option is very useful when you will be running DebugView for a long time waiting for a problem to occur and don’t want the process to consume unlimited amounts of system RAM.

Additional Notes for DebugView

At this point, you should be collecting debug output. Note that the “Capture Events” option (“Ctrl+E” keyboard shortcut) from the “Capture” menu is a toggle for enabling or disabling active collection of data from the debug store by DebugView. Also, you can clear the display using the “Clear Display” (“Ctrl+X” keyboard shortcut) selection from the “Edit” menu. You can also expand the “Debug Print” column to the maximum width of the window to display more information on-screen.

To save the log file at any point, choose “Save As…” from the “File” menu.  The default name is “<MachineName>.LOG”.

 

General Guidelines

In general, DebugView provides us with whatever our developers have coded the product to output to the Windows debug store. For the Enrollment Server, this is actually quite a bit of information.  In many situations, you can find something useful by comparing a log from a successful connection to a log from one that failed.  Additionally, if you see nothing in the log that’s associated to a device attempting to connect while other information is being logged, you can generally assume the device is not communicating with the Enrollment Server.

Background Processes

Even without device connectivity, you will see some output during normal background processing by the Enrollment Server.  One such process is “DeferredDelivery”.  This process runs every two minutes checking for any policies added to the A_IPHONE_DEFERRED_DELIVERY table.  These policies are those that require a device be in “supervised mode” and can’t be delivered on the first push after enrollment. If one or more rows exist in the table then the server will send outbound notifications to all affected devices in order to deploy the deferred policies.  Here is an example of what this looks like:

00000007    11:07:24.701 AM   [12408] thread:8 time: 2015/03/13 11:07:24 <SQL performDeferredDelivery: select * from A_IPHONE_DEFERRED_DELIVERY << performDeferredDelivery.getDeferredDeliveryQuery

00000008    11:07:24.704 AM   [12408] thread:8 time: 2015/03/13 11:07:24  RunQuery return <<getDeferredDeliveryQuery  

00000009    11:07:24.704 AM   [12408] thread:8 time: 2015/03/13 11:07:24 performDeferredDelivery.DeferredDelivery[0]

00000010    11:07:24.704 AM   [12408] thread:8 time: 2015/03/13 11:07:24 performDeferredDelivery done at 3/13/2015 11:07:24 AM     

Another regular background process is the one that processes devices that have returned a “NotNow” response when attempting to perform an MDM command.  This process runs every 10 minutes and checks the A_IPHONE_MDM_QUEUE table for any device with a row where the MDMStatus value is “2”.  Here is an example of that process running:

00000020    11:13:23.907 AM   [12408] thread:8 time: 2015/03/13 11:13:23 <SQL postProcessQueue: select * from A_IPHONE_MDM_QUEUE where MDMStatus=@cwe_MDMStatus << postProcessQueue.getQueueNotNowQuery, Params: 2     

00000021    11:13:23.910 AM   [12408] thread:8 time: 2015/03/13 11:13:23  RunQuery return <<getQueueNotNowQuery  

00000022    11:13:23.910 AM   [12408] thread:8 time: 2015/03/13 11:13:23 postProcessQueue.NotNow[0]  

00000023    11:13:23.910 AM   [12408] thread:8 time: 2015/03/13 11:13:23 postProcessQueue done at 3/13/2015 11:13:23 AM    

Initial Steps

Immediately from these two examples, you can begin to see certain details emerging.  In many cases, the actual SQL query being run for these processes is detailed in the debug output, minus the actual parameter values.  In some situations, you may need to run these queries yourself against the database to determine what data is being returned to the Enrollment Server. You can also see the class and method being run.  This information may be useful when a bug is identified, as you can provide it in the bug report.

You can also see two critical pieces of information in nearly every row.  First is the process ID, surrounded by square brackets.  The second is the thread ID, prefixed by “thread:”.  In the example below, the process ID is 12408 and the thread ID is 8.

00000020    11:13:23.907 AM   [12408] thread:8 time: 2015/03/13 11:13:23 <SQL postProcessQueue: select * from A_IPHONE_MDM_QUEUE where MDMStatus=@cwe_MDMStatus << postProcessQueue.getQueueNotNowQuery, Params: 2     

Process ID

The process ID is important because it allows you to determine if the source of the entry is the Enrollment Server backend (AfariaIphoneServer.exe) , the IIS frontend (w3wp.exe) or another process entirely. You can see process ID values for running processes in Windows Server 2008 using Task Manager.  From the “View” menu you can choose “Select Columns” and then check the box next to “PID” like this:

This will then show the process ID in Task Manager for all running processes.

Thread ID

The thread ID is helpful when multiple connections are happening simultaneously.  You can follow a thread ID to ensure you’re seeing entries for the same connection. Two key points to note about the thread ID are 1) threads are reused often and 2) a single device connection requires at least one IIS (w3wp.exe) thread and one Enrollment Server (AfariaIphoneServer.exe) thread.  Those IDs are independent to the process.  In the cases where the thread ID between the processes is the same by coincidence, it doesn’t necessarily mean it’s for the same device connection.

When following a thread ID, be sure to watch the debug entries for that thread for any distinguishing details of the device. Particularly common is the ClientUID and UDID.  If that changes while the thread ID has remained the same, you can be sure that the thread has been recycled.

Process Flow

Tracking process flow is not always guaranteed to be possible, as not every individual row will have a thread ID or a distinguishing identifier for a device.  In a very busy system it can be incredibly challenging to follow the potentially hundreds of connections that are occurring simultaneously.  However, in general you can find the start and end to certain tasks and see when IIS hands off to the Enrollment Server.

When you see a row like the following, with “Request to…” in the text, you can generally assume it’s the beginning of a connection from a device for a specific task/method (“mdmRegister” in the example below):

00000232    11:54:54.191 AM   [13940] thread:3 time: 2015/03/13 11:54:54 Request to mdmRegister: http://

This should pair up with an entry in the IIS log (located in the C:\inetpub\logs\LogFiles\w3svc1 folder by default) such as this:

2015-03-13 15:55:15 192.168.1.100 GET /aips2/aipService.svc/mdmRegister id=qqoBAAAAAQ…

In general, using the IIS log to track method call flow and return codes (“sc-status” + “sc-substatus” columns in the IIS log) will help you better determine what to search for in the DebugView log.  An example flow from the IIS log is:

GET /aipsr/Register
GET /aipsR/login.aspx
POST /aipsR/UserPrompts.aspx
GET /aips2/aipService.svc/mdmRegister
POST /aips2/aipService.svc/BootstrapEnrollment
GET /aips2/aipService.svc/scep

When you see a row with “sendMessage” in it, like this example below, you can generally assume this is the end of the communication back to the device, including the HTTP status code returned:

00000280    11:55:14.868 AM   [13940] thread:3 time: 2015/03/13 11:55:14 sendMessage HttpStatusCode.OK      

Additionally, when the IIS frontend spawns a connection to the Enrollment Server backend, there is often a very specific format to identify this:

00000193    11:54:50.072 AM   [10424] thread:6 time: 2015/03/13 11:54:50 CreateProxyInstance registry endpoint:     

00000194    11:54:50.072 AM   [10424] net.tcp://localhost:7007/AfariaIphoneServer[2147483647]

More often than not, the row immediately following this entry will be the thread for the Enrollment Server that was spawned or reused.  You can use this to

then follow the backend thread ID for the continuation of the process.

For post-enrollment MDM communication, the commands are retrieved by the device over the “PutMdm” method.  You can search the log for those entries with “PutMdm Server Request” like this:

00000998    12:46:56.280 PM   [15968] thread:3 time: 2015/03/13 12:46:56 PutMdm server Request      

To search for the status of a device that responded to an MDM command, you can search for “deviceStatus.”, such as this:

00000964             12:46:53.764 PM              [15968] thread:3 time: 2015/03/13 12:46:53 >>> PutMdm : deviceStatus.Idle

Explanation of the status results are as follows:

Idle = Device has no current task or has finished a previous task and is ready for the next one

Acknowledged = Device has received a task and is preparing to execute it

NotNow = Device is unable to perform the requested task at this time (usually because it’s locked with a passcode)

Many tasks require the device to provide inventory XML data as a response.  You can verify that the Enrollment Server is processing the file and the associated task for that specific XML file.  Here’s an example of that for the InstalledApplicationList task:

00001089    12:46:56.975 PM   [12408] thread:8 time: 2015/03/13 12:46:56 processFiles : C:\Program Files\AIPS\Bin\Inv\ghcsmnp1.xml    

00001090    12:46:56.976 PM   [12408] thread:8 time: 2015/03/13 12:46:56 processFiles : xml InstalledApplicationList    

When searching, you should always find an occurrence of either the UDID or ClientUID of a device.  If you collect those details from the database and your search yields nothing in the DebugView log then it indicates the device never reached the Enrollment Server.

 

Specific Examples

Below are a few examples of where DebugView is helpful and how it can be used to confirm a known issue.  As you become more familiar with this troubleshooting tool, you’ll be able to use it to confirm known issues but also isolate causes of problems we have not seen before.

The SCEP Server returned an invalid response

This error is very common.  The general cause is that the response from the SCEP server was not a valid certificate file. This is either because an error response was returned instead of the file returned was not a valid certificate.

Beginning with Service Pack 2 for Afaria 7.0, the Enrollment Server is now a proxy for iOS enrollment certificate requests. This means that when a device requests a certificate via SCEP during enrollment the device believes the Enrollment Server to be the SCEP server.  The device submits the request and then the Enrollment Server submits a request to the Certificate Authority.

The error returned by the device can be the result of either the CA not returning the expected certificate to the Enrollment Server or the Enrollment Server returning an error.  If the CA returns an error when the Enrollment Server requests the certificate then it will return an error status back to the device.  To the device, both of these situations appear the same.  Also, even if Afaria is configured for a “Native mode” CA, the device still sees it as a SCEP server.

For this example, we know that the failure is in the SCEP request.  So we can begin by checking the IIS logs.  Searching for “/scep” we find two entries at the time this device attempted to enroll.

2015-03-13 15:55:55 192.168.1.100 GET /aips2/aipService.svc/scep id=<removed>?operation=GetCACert&message=LocalSCEP 80 – 192.168.1.100 profiled/1.0+CFNetwork/711.1.16+Darwin/14.0.0 200 0 0 1413

2015-03-13 15:55:58 192.168.1.100 GET /aips2/aipService.svc/scep … 80 – 192.168.1.100 profiled/1.0+CFNetwork/711.1.16+Darwin/14.0.0 400 0 0 2686

We can see that the first request returned a status of “200.0” and the second one returned a status of “400.0”, which is the HTTP error “Bad Request”.  We can then use that to search the DebugView log.

Device requests to the “/scep” method are logged with the text “Request to GetSCEP”.   So we search for that.  The first occurrence is this:

00000494             11:55:54.158 AM              [13940] thread:3 time: 2015/03/13 11:55:54 Request to GetSCEP: http://knvma70sqlany/aips2/aipService.svc/scep?id=<removed>?operation=GetCACert&message=LocalSCEP

00000495    11:55:54.170 AM   [13940] thread:3 time: 2015/03/13 11:55:54 parseSCEPrequest ret: True cmd: GetCACert    

00000496    11:55:55.532 AM   [13940] thread:3 time: 2015/03/13 11:55:55 getAfariaSCEP ret: True http: 200   

We can see here that this is the first request that returned a “200” response.  So we move on to the next one:

00000497    11:55:56.273 AM   [13940] thread:3 time: 2015/03/13 11:55:56 Request to GetSCEP: http://knvma70sqlany/aips2/aipService.svc/scep?id=<removed>?operation=PKIOperation&message=<removedCertData>    

00000498    11:55:56.274 AM   [13940] <contRemovedCertData>     

00000499    11:55:56.275 AM   [13940] thread:3 time: 2015/03/13 11:55:56 parseSCEPrequest ret: True cmd: PKIOperation 

00000500    11:55:56.737 AM   [12408] CertLib: csiLogOpen: Opened C:\Windows\certenroll.log     

00000501    11:55:56.822 AM   [12408] lookup Id: {415439d1-02de-4cbe-b1c2-c8df47b779bf}      

00000502    11:55:58.935 AM   [13940] thread:3 time: 2015/03/13 11:55:58 getAfariaSCEP ret: False http: 500      

Here we see the error that’s returned as “http: 500”, which is the status code for “Internal Server Error”.  This shows us that the request to the CA returned an error.  We can then check the CA for those errors.  We would check the CA’s IIS log as well as the Event Viewer Application log.  We would also check the CA to see if there were any failed or pending requests that match the time period when we received the error on the device.

In this case, we were able to locate the error in the Windows Event Viewer in the Windows Logs > Application log.  An error entry for the “NetworkDeviceEnrollmentService” source had the message “The password in the certificate request cannot be verified. It may have been used already. Obtain a new password to submit with this request.” This was used to determine that the server requires a challenge phrase but Afaria was not configured for SCEP challenge.

Enrollment Error – Invalid or Expired

When a user attempts to enroll, on the device they may see the error message “The enrollment information is invalid or has expired.”

When looking at the IIS log, we see the following:

2015-03-13 16:15:59 192.168.1.100 GET /aipsr/Register ID=x2yacei4lcr7ei&ClientType=-8&RS=/rs1/client/rs_client.dll/knunn.a7es/ 80 – 192.168.1.100 Mozilla/5.0+(iPhone;+CPU+iPhone+OS+8_1_3+like+Mac+OS+X)+AppleWebKit/600.1.4+(KHTML,+like+Gecko)+Version/8.0+Mobile/12B466+Safari/600.1.4 302 0 0 665

2015-03-13 16:15:59 192.168.1.100 GET /aipsR/login.aspx ReturnUrl=%2frs1%2fclient%2frs_client.dll%2fknunn.a7es%2f%2faipsr%2fRegister%3fID%3dx2yacei4lcr7ei%26ClientType%3d-8%26RS%3d%2frs1%2fclient%2frs_client.dll%2fknunn.a7es%2f&ID=x2yacei4lcr7ei&ClientType=-8&RS=/rs1/client/rs_client.dll/knunn.a7es/ 80 – 192.168.1.100 Mozilla/5.0+(iPhone;+CPU+iPhone+OS+8_1_3+like+Mac+OS+X)+AppleWebKit/600.1.4+(KHTML,+like+Gecko)+Version/8.0+Mobile/12B466+Safari/600.1.4 200 0 0 657

Both of these are expected results.  The request to “Register” redirects to “login”.  So there is a 302 redirect response and then a 200 OK response.  Checking the DebugView log we can see the following:

00000057    12:16:00.517 PM   [12408] thread:45 time: 2015/03/13 12:16:00 <SQL validateActivationCode: Select CreationDate, RegisteredUserName, ExchangeUser from A_EUSSP_CODE left join A_EUSSP_VARIABLES on A_EUSSP_CODE.ActivationCode = A_EUSSP_VARIABLES.ActivationCode where A_EUSSP_CODE.ActivationCode=@cwe_ActivationCode << validateActivationCode.get activation code creation date, Params: yace     

00000058    12:16:00.519 PM   [12408] thread:45 time: 2015/03/13 12:16:00 validateActivationCode: Activation code not allowed – code not found in database      

00000059    12:16:00.524 PM   [21056] thread:7 time: 2015/03/13 12:16:00 aipsR.getAfariaRegistrationData ret: False    

00000060    12:16:00.527 PM   [21056] thread:7 time: 2015/03/13 12:16:00 aipsR.Login.Logon_Click HttpStatusCode.PreconditionFailed 

00000061    12:16:00.600 PM   [21056] thread:7 time: 2015/03/13 12:16:00 aipsR.Login.Logon_Click Exception: Thread was being aborted.    

For this one, we can see that the activation code provided was not found in the database.  That means either it was already used or the database was not properly updated for some reason.  In this case, it was the former.  The user had already used the enrollment URL and had a failure during the first attempt. He then tried to use the same URL again. Generating a new enrollment URL will allow enrollment to complete.

Message Signing Validation Failure

In the recent hot fix releases for SP4 and all releases of SP5, iOS devices are required to sign their MDM communication using the device identity certificate issued during enrollment.  This means that all requests to the “PutMdm” method or “TokenCheckin” method require signed data.

During enrollment, the device notifies Afaria of its MDM token via the “TokenCheckin” method.  If the packet isn’t signed, the device will see “Profile Installation Failed”. In the device log, you would find the error “A connection to the server could not be established.”  Additionally, a device that was enrolled prior to an upgrade that includes this signing requirement may attempt to connect to the “PutMdm” method and not sign the packet.

As a result of both of these scenarios, you would see errors like the following in the Server > Messages log in the Afaria Administrator:

IPH6034: Token Checkin MDM Message Signing Validation Failure. UDID: <UDID>

IPH6035: PutMDM Message Signing Validation Failure. UDID: <UDID>

Profile Installation Failed:

For the scenario of enrollment failure, it’s the TokenCheckin which fails.  Searching the DebugView log we would find an entry like this:

00000720    72.54165649 [3964] thread:39 time: 2014/09/10 16:06:49 PUT TokenCheckin Request      

Immediately below this we would see where it tries to validate the provided signature.  In one case you may see something similar to this:

00000721    72.54244232 [3964] thread:39 time: 2014/09/10 16:06:49 >>> getSignedMDMpacket : Mdm-Signature    

00000722    72.54549408 [3964] thread:39 time: 2014/09/10 16:06:49 getSignedMDMpacket msPkcs7Io.verifyDetachedSignature : msPkcs7Io.verifyDetachedSignature.x509: 

00000723    72.54549408 [3964] CryptographicException SignedCms.CheckSignature:   

00000724    72.54549408 [3964] The revocation function was unable to check revocation for the certificate.     

This shows that while checking the revocation of the certificate, that we received an error back from Windows (.NET).  This explains why the device sees the error.  You would need to investigate the ability of the Enrollment Server to check revocation for the device identity certificates.

Another possible option is like this:

00000662    14:25:43    [17360] thread:10 time: 2015/01/20 14:25:43 getSignedMDMpacket no Mdm-Signature

  

In this instance, you can see that the packet coming from the device had no signature data at all.  So either the device wasn’t signing the payload or something else is modifying the packet data before it reaches the Enrollment Server.

When signing is correct, you should see an entry like this:

00000732    12:46:46.909 PM   [15968] thread:3 time: 2015/03/13 12:46:46 TokenCheckin dev: <UDID> Signed: True SignatureVerified: True

Existing device not receiving policy:

The initial symptom for this is usually just that existing devices are not receiving policy updates or MDM delivered apps.  Upon checking the message log, you find the IPH6034/IPH6035 errors during the outbound notification time period.

Since MDM commands are retrieved by connection to the “PutMdm” method, you can start by searching for that.  Here’s an example where the signature doesn’t exist:

00000517    6.93393469  [2936] thread:97 time: 2014/12/15 10:49:41 PutMdm server Request      

00000518    6.93432140  [2936] thread:97 time: 2014/12/15 10:49:41 getSignedMDMpacket no Mdm-Signature  

00000519    6.93440151  [2936] thread:97 time: 2014/12/15 10:49:41 getSignedMDMpacket Elapsed time: 00:00:00.0000914 

00000520    6.93459320  [2936] thread:97 time: 2014/12/15 10:49:41 PutMdm dev: <UDID> Signed: False SignatureVerified: False

    

Troubleshooting this would be the same as the “Profile Installation Failed” situation above.  In both cases, if no signature exists then either the device is not signing the communication or the packet data is being modified between the device and the Enrollment Server.  If the revocation check fails then it will be logged in DebugView.

Final Notes

It is good to become familiar with the general flow for a successful enrollment.  Knowing how that looks in the IIS log and what to expect in the DebugView log will help you identify problem areas.  Both the DebugView log and the IIS log are important details to collect with any new problem in communication between a device and the Enrollment Server.  Remember, however, that DebugView log entries will be in local server time while IIS logs are in UTC.

It is possible that simply searching for “error” or “exception” can help you isolate problems but often the issues are more subtle. In general, you should be able to isolate an error somewhere in the process for the debug log to be useful. So if no error is obvious in the debug log, then one may exist as an error return code in the IIS log or as an error on the device itself.

Special thanks to Keith Keith for his agreement to publish thid document and his great contribution to troubleshoot many Afaria incidents.

To report this post you need to login first.

Be the first to leave a comment

You must be Logged on to comment or reply to a post.

Leave a Reply