Monday, 27 June 2016

The depressing feeling of the software slippery slope....

Today has been a day when I have implemented precisely nothing and have spent the day fault-finding and fixing two problems. I have only managed one of them but it is all too common in software, especially when you manage several different parts of the same system.

1) The web app errors
I came in to an email saying that our site was not working. Although it seemed to be working for me, the error emails (very useful, highly recommended!) told a different story. Part of our web service is supposed to speak to a read-only replica of our database, as part of our move towards higher bandwidth on our site. Obviously only read-only procs are supposed to use this connection, otherwise you get errors (which we had). I had seen this before and it seemed weird, since the code "clearly" showed that these methods were not calling the read-only replica and so should not have produced the errors. After probably too long, I noticed that my lazy loading was "very lazy": I had reused the same variable for the writable and the read-only databases, which meant whichever one was called first became the connection and every subsequent database call would try to use it. Bah. But not too bad, at least it was an easy fix.
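For anyone who has not hit this one, here is a minimal sketch of the shared-variable bug and its fix. It is written in Python rather than C# purely to keep it short, and every name in it (FakeConnection, BuggyDatabase, FixedDatabase, "primary", "replica") is made up for illustration, not taken from the real code.

```python
class FakeConnection:
    """Hypothetical stand-in for a real database connection."""

    def __init__(self, target):
        self.target = target


class BuggyDatabase:
    """Both accessors lazily fill the SAME field, so the first caller wins."""

    def __init__(self):
        self._conn = None

    def writable(self):
        if self._conn is None:
            self._conn = FakeConnection("primary")
        return self._conn

    def read_only(self):
        if self._conn is None:
            self._conn = FakeConnection("replica")
        return self._conn


class FixedDatabase:
    """Each connection gets its own lazily initialised field."""

    def __init__(self):
        self._writable = None
        self._read_only = None

    def writable(self):
        if self._writable is None:
            self._writable = FakeConnection("primary")
        return self._writable

    def read_only(self):
        if self._read_only is None:
            self._read_only = FakeConnection("replica")
        return self._read_only


buggy = BuggyDatabase()
buggy.read_only()               # a read happens to come first...
print(buggy.writable().target)  # ...so writes now go to "replica"

fixed = FixedDatabase()
fixed.read_only()
print(fixed.writable().target)  # "primary", as intended
```

The nasty part is that the buggy version works fine whenever a write happens to come first, which would explain a site that looks healthy to one person while the error emails say otherwise.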

Except I had stepped onto the slippery slope

2) The updated project
Since this site was last deployed, we have been moving some of the projects from Subversion to Visual Studio Team Services (online), which will hopefully track and streamline our deployment moving forwards. This meant that some of the projects that were referenced in my solution were now referenced as NuGet packages, which is much neater and pretty sweet. What I hadn't realised was quite how bad some of my old library references were. Not only was I referencing projects, but also DLLs in other projects on disk and some really weird references, like something that used to exist in my downloads folder. The new online projects were good because they were building online without any nasty physical references, but not all missing libraries will cause a build failure. My web service had already been updated, so I rather optimistically deployed my fix along with the project reference changes, and it failed online.

You would think the library loading errors would be more useful by now but nope. Cannot load Security.Cryptography or one of its dependencies, a strong named assembly is required. Well my calling library is strong named - I checked. Security.Cryptography doesn't seem to have any non-system dependencies so what on earth is going on? Trying to roll the project back was a pain since I had deleted the original folders once the VSTS projects had built because I hadn't changed the code and didn't need the old copies any more.

About two hours of fiddling, rebuilding, debugging, enabling remote desktops and faff later, I narrowed it down to one library and restored the old project bindings just to get it working. So now would be a good time to find out exactly what is wrong by using a local test server and fix all the dependencies.

3) The broken web publish in Visual Studio 2015
The NuGet package system requires Visual Studio 2015 for some reason so I opened my project in VS2015 (it was 2013 originally), built it and clicked "Publish". Bang. VS crashed. Followed some online suggestions. Clean. Crash. Delete profiles and recreate. Crash. Save the profile before deployment, choose publish again, crash. Debug in another Visual Studio and there is some dirty error about a missing method in the publish mechanism. Like most Visual Studio searches, there are about a trillion different forum questions about everything and about 100 different answers which might or might not work, including loads of "this worked for me". Still nothing. In the end, I had to use VS2013 to publish it - I didn't have any more will to try and get help.

The slope was still slippery though

4) The case of the missing database tables
I have a task that runs daily to back up our live database, download the backup to a test server and install the most recent one as a test copy of the database. It also runs some statistics processing - number of users etc. At some point recently, it had obviously stopped working: the database was being created but none of the data was being loaded from the bacpac.

The error log was hopeless, just another generic SQL Server error - I have no idea why it couldn't have logged the actual (and very simple) SQL error that was occurring when SqlPackage was being run: "You can only create a user with password in a contained database".

Of course you can, and our database is contained, which is why it contains the contained user in the first place - something we learned the hard way that you need for geo-replication. Somehow the bacpac file did not contain the contained setting, which caused the failure. Head is hurting now.

5) How to export the bacpac with the correct option
A Google search revealed a few other people had the same issue, plus a lovely comment from MS that since it only affects contained users, it only affects a few people and will be a low priority. Anyway, it was allegedly fixed according to the Connect issue although, very lazily, there were no details about when/how/where it was fixed. Clearly the fix didn't work using the horrible old web service method of database backups, which was basically a SOAP web service.

I clearly needed to use a newer method which was much more likely to work correctly. There are two ways that you can allegedly work with SQL databases from .NET (3 if you count the "classic" method) and they are 1: the REST API and 2: the .NET libraries. Both techniques are woefully undocumented. I spent about 2 hours trying to get the REST API to work since the current app uses an API style syntax. Nothing. I eventually managed to get a 401 with the detail that "the Authorization header is not present". Really interesting since it is COMPLETELY absent from the API documentation: not just what credentials are supposed to be used but even that a header is required at all! OK, I'll assume the response is correct and I need an authorization header.

Whoops, the slope is still descending

6) The HttpWebRequest class in .NET is often used for API connections. It appears pretty obvious to use (although several people online recommended some better third-party alternatives). So what do you do? You set request.Credentials. OK, so I'll try that. Nope. PreAuthenticate = true? Nope, that doesn't pre-authenticate, it only caches creds for subsequent calls to the same url. Now what normally happens is that if a protected resource is requested, the first response is 401 (Unauthorized), which MUST include a WWW-Authenticate header that says what auth mechanisms are acceptable. The client can then choose one, send the credentials back in the correct format, and the second request will then return the correct response if authentication is successful. Except it wasn't working. Does the HttpWebRequest do the 2 calls automatically under the covers or do I need to catch the first 401 and call it again? I couldn't find an answer to that, which makes me think that it should. Except it wasn't - or at least I was still getting 401. Then I saw something weird about the web client not handling a 401 if it comes back in json format rather than plain html.
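One way to take the 401 challenge dance out of the picture is to build the Authorization header yourself and send it on the very first request. Here is a minimal sketch of that idea, in Python rather than C# to keep it short. It assumes the Basic scheme purely for illustration; nothing in the API's responses described above says which scheme it actually wants, and the URL, username and password are placeholders. The helper names (basic_auth_header, challenge_scheme, build_request) are hypothetical too.

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """Build the value of an Authorization header for the Basic scheme."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def challenge_scheme(www_authenticate):
    """Return the first scheme named in a WWW-Authenticate header value."""
    return www_authenticate.split()[0]


def build_request(url, username, password):
    """Attach the header up front instead of waiting for a 401 challenge."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_header(username, password))
    return request


# Placeholder values, not a real endpoint or real credentials.
request = build_request("https://example.invalid/api", "user", "pass")
print(request.get_header("Authorization"))
print(challenge_scheme('Bearer realm="example"'))
```

Nothing is sent over the network here; the point is only to see exactly what header goes out, which is the one thing that is hard to check when the client library handles the challenge under the covers.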

I'm tumbling down now, the top of the slope is far away

7) Fiddler problems
Fiddler is a great tool but it is getting harder to use now with SSL sites. Although Fiddler should be able to decrypt SSL, it didn't work with this API request - it just caused the trust to be broken by the API connection even though I had trusted Fiddler's root certificate. So I can't even see what is happening - whether there are two requests or whether something else weird is going on. Maybe the MS API is not returning the headers correctly so that the client can see them.

This was feeling like a real dead end so perhaps the .net libraries would be an easier way to perform the same job.

8) .net documentation is like a Brutalist housing estate
Most of you who program in .NET have experienced the sinking feeling in amongst the vast, vast reams of .NET documentation. You have experienced the lack of examples, the lack of complete documentation, the lack of documentation of all possible return values and all errors, and even of basics like the specific format of a parameter rather than something useless like "the azure server name".

The documentation for .Net SQL Azure access is appalling. It is a vast list of auto-generated documentation. No guidance on the front page, no list of basic tasks just hundreds of classes. Basically unusable.

Then I tried finding the front page of Azure documentation and again, nothing under "Develop", nothing under "Manage" and no real thought into moving away from these useless abstract verbs and into more concise groupings that will allow me to find what I want.

9) I am dead
My system doesn't work, I have exhausted all my avenues right now so unless I get some help on Stack Overflow, I might have to download the backups and not process them until such a point as I can afford someone in my team to spend a week getting it to work again.