Status of OpenChrome DRM (March 2018)

March 2018 was a huge month for the OpenChrome Project.

I was finally able to fix the long standing (15+ months . . .) bug of the X Server crashing when the screen resolution is changed during runtime. In addition to that, I was also able to fix two hardware cursor related issues.

I will start working on adding support for VIA Technologies VT1632(A) and Silicon Image SiI 164 DVI transmitters during the month of April. After those devices are supported, OpenChrome DDX’s UMS (User Mode Setting) and OpenChrome DRM’s KMS (Kernel Mode Setting) will have comparable device support. That will bring OpenChrome DRM closer to being maintained within the Linux kernel tree.

In addition to that, the drm-next-4.17 branch pulled in the Linux 4.16-rc7 kernel. OpenChrome DRM will not compile for now due to the changes made to GEM and TTM. I hope to resolve this issue soon.

Mostly fixed OpenChrome DRM hardware cursor (with a minor regression . . .)

While it is true that resurrecting the ADATA Ultimate SU800 SSD 256 GB model took some time away from OpenChrome DRM development, I have also been working on fixing hardware cursor related issues. I tried various schemes, but I still could not get the hardware cursor to behave correctly in dual head mode.

Eventually, I came up with an implementation of turning off the hardware cursor during mode setting, but turning it right back on when the associated input device is touched. While this implementation is not perfect, it actually has a “nice” side effect of not leaving the hardware cursor on after switching to VT (Virtual Terminal). The downside to this implementation is that right after standby resume, the hardware cursor will be turned off, but again, if you move the associated input device, it will come back on.
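To make the behavior described above easier to follow, here is a minimal Python sketch of the cursor policy. It is only a toy model, not OpenChrome DRM code: the class and method names are hypothetical, and it assumes that a VT switch and a standby resume both go through the mode setting path.

```python
class CursorState:
    """Toy model of the hardware cursor policy (hypothetical, not driver code)."""

    def __init__(self):
        # Assume the cursor starts out visible on a running desktop.
        self.visible = True

    def on_mode_set(self):
        # Mode setting turns the hardware cursor off. Under the stated
        # assumption, this also covers VT switch and standby resume.
        self.visible = False

    def on_pointer_event(self):
        # Touching the associated input device turns the cursor back on.
        self.visible = True


cursor = CursorState()
cursor.on_mode_set()        # e.g., right after standby resume
cursor.on_pointer_event()   # the user moves the mouse
```

The trade-off is visible in the model: the cursor can never be left on by a mode set, but it also stays off until the next input event.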

I also figured out the reason why I was seeing a corrupted hardware cursor on the screen right after standby resume, and put in a fix for it.

So now, I have resolved two issues on the TODO list. I will start working on supporting external DVI transmitters (SiI 164 and VT1632(A)) next.

Updates on ADATA Ultimate SU800 SSD on HP 2133 Mini-Note

Okay, I have some updates on the ADATA Ultimate SU800 SSD since I upgraded the firmware. Unfortunately, the firmware upgrade did not fix the issue of the SSD suddenly disappearing during heavy use. Since the previous post, I have had numerous cases of the SSD disappearing during Linux kernel compilation. Although it is possible that faulty hardware is the cause, I suspect the issue to be firmware or heat related.

These are some strange behaviors I have seen from ADATA Ultimate SU800 SSD 256 GB model.

  • The SSD disappears randomly (usually when it is subjected to heavy use)
  • The device remains unresponsive even after a cold boot if the computer is turned back on within 10 seconds (it is okay if I keep it off for more than 15 seconds)
  • The SSD has gotten lost after standby resume multiple times

According to a disk utility, the SSD’s reported temperature was around 50 degrees Celsius inside the HP 2133 Mini-Note. Please note that I have not experienced any noticeable data loss. It appears that Micron Technology’s floating gate based 3D NAND Flash memory is holding up pretty well. I think the issue might be with Silicon Motion’s SSD controller.

My personal suggestion to people thinking of buying the ADATA Ultimate SU800 SSD is to stay away from this model. I do not think it is worth the trouble I have experienced with it.

Struggling with ADATA Ultimate SU800 SSD on HP 2133 Mini-Note

For the past few days, I have been taking a break from OpenChrome development. This is because I have been getting annoyed by my ADATA Ultimate SU800 SSD occasionally failing to come back to life after standby resume, so I decided to apply firmware upgrades to the SSD.

Going back to late 2016, I purchased two ADATA Ultimate SU800 SSDs at Fry’s Electronics since they were being sold at very low price points (128 GB model for $30 after mail in rebate and 256 GB model for $50 after mail in rebate). I had no problems getting both rebates back after 3 months, but the problem was that the 256 GB model never worked very well. It would occasionally disappear as a storage device after some time in use (30 minutes to a few hours; sometimes it would not even boot Xubuntu), and I needed to hard power off the computer every time this happened. Since this happened pretty randomly, at some point I pretty much had to abandon the use of the 256 GB model for OpenChrome development purposes. The 128 GB model worked better, but once in a while (about once a week), the OS would no longer see the device after I resumed from standby.

Under normal circumstances, most people would have returned the product to the retailer right away. In my case, there were several reasons why this did not happen.

  1. Bought the item right before a month long trip (I did bring it with me.)
  2. It was an open box item (the only item available)
  3. Did not notice the stability issue right away
  4. The “Warranty Void If Removed” sticker was damaged by the previous owner (this is the risk of buying an open box item)
  5. Did not feel like dealing with the RMA (Return Merchandise Authorization) process

Maybe I got stuck with this open box item because the previous owner returned it because it did not work . . ., and Fry’s Electronics just returned the item to the shelf. Since the item came with a 3 year warranty, I did finally contact ADATA technical support for an RMA around late February 2018. I got the RMA form and filled it out.

Since it does cost postage to ship the item back, I decided to give a firmware upgrade one last chance before shipping it back. Regarding the firmware for ADATA Ultimate SU800 SSD, there are two known versions available as of this writing.

  • Version Q0125A available since March 2017
  • Version Q0518BS available since late December 2017

When I think about it, these release timings were pretty bad. When I first tried the 256 GB model, I gave up on it around late February 2017. Almost right after that, the first firmware upgrade was released (Version Q0125A). The second time I tried it, it was around mid-December 2017. At that time, I knew about the first firmware upgrade, but it was not clear which issues the new firmware version fixed, so I did not really think about the upgrade. Furthermore, I did not have access to a Windows 10, Windows 8.1, or Windows 7 computer during that time.

Anyway, I finally got tired of the ADATA Ultimate SU800 SSD issues even with the fairly functional 128 GB model, so I decided to upgrade the firmware of both models. To do this, I dragged out a Windows 7 computer I had not used for more than a year. The firmware upgrade for the 128 GB model went without a hitch. Please note that I performed two upgrades. First, I upgraded the original firmware to Version Q0125A. Then, I powered down the unit, powered the computer on again, and upgraded the firmware to Version Q0518BS. For the 256 GB model, the upgrade to Version Q0125A went without a hitch, but when I tried to upgrade to Version Q0518BS, the initial attempt failed. I powered down the computer, reconnected the SSD to a different SATA port, and tried the firmware upgrade again. This time the firmware upgrade succeeded.

Since the next step for the 256 GB model was an RMA return, I decided to test the model to see if anything had changed. I find compiling the Linux kernel to be a good way to stress test a piece of hardware, so I decided to compile the drm-next-4.17 branch of OpenChrome DRM. Since I am doing this on the HP 2133 Mini-Note, it takes 20+ hours (due to the very slow VIA Technologies C7-M 1.2 GHz processor . . .). To my surprise, the unit has not had the same issue of the SSD suddenly disappearing during normal operation that plagued it back in February 2017. I have not finished the compilation yet, but I am surprised that the ADATA Ultimate SU800 SSD is working so well after just two firmware upgrades.

I also retested the 128 GB model and it appears to be working fine. The jury is still out on how it will behave after standby resume, but so far, I have not seen the issue. That being said, I usually need about 2 weeks of heavy use to see if the issue is still there. That’s it for my review of the ADATA Ultimate SU800 SSD. If you are struggling with the SSD’s stability, you will definitely want to apply the firmware upgrade as soon as possible.

Other OpenChrome DRM issues I will need to take care of next

Now that basic functionalities like standby resume and runtime screen resolution change are working reliably, these are the remaining issues of OpenChrome DRM I need to take care of.

I will categorize the issues by type.

Device Support:

  • Add support for VIA Technologies VT1632(A) TMDS transmitter for DVI
  • Add support for Silicon Image SiI 164 TMDS transmitter for DVI

Regression:

  • Temporary corruption of hardware cursor right after standby resume (annoying)
  • Corrupted display when VX900 chipset’s integrated HDMI is connected to an HDMI TV

Undesirable Behavior:

  • Hardware cursor stays on after switching to a VT (Virtual Terminal) screen

Code Development:

  • Replace the GEM / TTM code written by the previous developer with newly written code

Administrative Issues (Linux kernel tree mainlining issues):

  • Convert file names starting with via_*.* to openchrome_*.*
  • Convert functions and variables starting with via_* to openchrome_*
  • Convert white spaces to tabs
  • Removal of fencing related code written by the previous developer
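The mechanical part of these conversions could be scripted. What follows is a minimal Python sketch, not the procedure I actually used: the helper names are hypothetical, the regular expression only catches names that begin with via_, and the whitespace helper assumes 8-column tabs and only touches leading indentation. Any real conversion would still need a manual review afterwards.

```python
import re

# Matches the old prefix only where it starts a name, e.g. via_crtc_init
# or via_drv.c, but not inside a longer name such as my_via_helper.
_OLD_PREFIX = re.compile(r"\bvia_(?=\w)")


def rename_prefix(text: str) -> str:
    """Replace a leading via_ prefix with openchrome_ in names found in text."""
    return _OLD_PREFIX.sub("openchrome_", text)


def leading_spaces_to_tabs(line: str, width: int = 8) -> str:
    """Convert leading runs of spaces to tabs (assumes 8-column tabs)."""
    stripped = line.lstrip(" ")
    count = len(line) - len(stripped)
    # Whole groups of `width` spaces become tabs; any remainder stays as spaces.
    return "\t" * (count // width) + " " * (count % width) + stripped


print(rename_prefix("via_drv.c"))
print(rename_prefix("struct via_device *dev = via_get_device(crtc);"))
print(repr(leading_spaces_to_tabs("        int ret;")))
```

The same rename_prefix helper works on file names and on source text, which covers the first two items of the list in one pass.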

If I can get through these issues, then I think OpenChrome DRM is ready for Linux kernel mainline inclusion. If the DRM maintainer insists on universal plane and atomic mode setting support, then this is when I will go implement them.

Finally . . ., fixes for runtime screen resolution change crash when KMS is activated

It took me more than 15 months to fix this bug, but I finally accomplished it! I am happy to announce that OpenChrome can now handle runtime screen resolution change when KMS (Kernel Mode Setting) is being utilized.

Previously, this only worked reliably when OpenChrome DDX was performing UMS (User Mode Setting), but due to many Linux distributions not automatically installing UMS only DDX when installing the OS, it has been a very high priority for the OpenChrome Project to get KMS working, so that OpenChrome does not get treated like a second class citizen in the Linux world.

Again, I will have to go into my usual “blaming the previous developer” commentary, but the KMS runtime screen resolution change functionality was already broken when I inherited the project. In fact, the UMS side was broken as well. I did fix the UMS runtime screen resolution change crash with OpenChrome DDX Version 0.5.

Anyway, for KMS, I always believed that there was something wrong with OpenChrome DRM itself, so until very recently, my focus was to find the bug inside OpenChrome DRM. Eventually, I had to conclude that there was nothing wrong with OpenChrome DRM, and started to suspect DDX was at fault. The first time I got this confirmation was on March 2nd, 2018, but due to the regression I caused with the FP, I had to make fixes to the OpenChrome DRM code first before I could fully test the runtime screen resolution change functionality. OpenChrome DRM Version 3.0.78 fixes the FP detection regression I observed, and without this fix, crashes involving the X Server still occur. Now that OpenChrome DRM’s issue with FP is fixed, I updated OpenChrome DDX’s code. OpenChrome DDX Version 0.6.171 fixes this nasty bug.

I have only tested the code fixes with the HP 2133 Mini-Note on Xubuntu 16.04.4 LTS so far, but as far as runtime screen resolution change is concerned, the KMS based one has about the same stability as the legacy UMS based one. Of course, standby resume was fixed in October 2017, so now, I finally have a stable desktop environment where basic functionality like runtime screen resolution change or standby resume no longer crashes the computer regularly.

What this means is that OpenChrome DRM getting into the Linux kernel mainline tree by the end of 2018 is a realistic possibility as long as the DRM maintainer does not mind accepting code that does not support universal plane and atomic mode setting. If those two have to be implemented, I think it can take another 6 months or so.

I also feel vindicated with my development approach of OpenChrome DRM. Back in September 2017 over at XDC2017, one senior developer who works at one tier-1 semiconductor company very strongly urged me to immediately convert the mode setting code to support atomic mode setting. If I were to describe what I mean by “strongly urged,” it was almost like the zeal of an evangelist in how this person wanted me to convert the code to support atomic mode setting. I politely turned this idea down for the following reasons.

  1. I had only gotten OpenChrome DRM ported to Linux 4.13 about a week prior to XDC2017
  2. At the time, OpenChrome DRM did not contain the better FP mode setting code proven with DDX’s UMS code path (again mainly due to point 1)
  3. Overall, OpenChrome DRM was at something of a “bring up” level state, therefore, I was not willing to go implement new functionality like atomic mode setting until things stabilized

The discussion about this issue continued over at the bar where the after party was being held (downtown Mountain View, California). As one can imagine, the senior developer who works at one tier-1 semiconductor company held on to the same view (immediately convert to atomic mode setting) and I was not willing to give in regarding this issue due to the above reasons. Another developer with a major Linux distribution company who works on the graphics stack joined the conversation, and it essentially became the two of them trying to double team me regarding the atomic mode setting issue.

I had to rephrase the above point 2 by noting that OpenChrome DRM currently does not have the improved FP detection code that uses strapping resistors to detect the FP type, and I wanted to port that code first before venturing into the risky atomic mode setting code conversion. The reaction I got from this senior developer was a little surprising. “What is strapping resistors?” Since this senior developer worked at a tier-1 semiconductor company and worked on the Linux graphics stack, I assumed that someone like him would have at least some rudimentary understanding of the underlying graphics hardware.

I did explain to him how some graphics cards use a few resistors at power on time to learn the basic configuration of the hardware. VIA Technologies Chrome IGP does this as well, and one of the reasons why OpenChrome appears to handle standby resume a lot better than it did in the past is because I started to look at the strapping resistors to figure out the hardware configuration when performing the mode setting. This is why your display is working after standby resume with the fairly recent version of OpenChrome as opposed to older versions (Version 0.3.3 or older) I was not involved with.

The other developer did not appear to know what strapping resistors are either. I guess this is what happens when software only type people work on device drivers (Note: I come from a hardware engineering background.).
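For readers who have never dealt with strapping resistors, the idea can be shown with a small Python sketch. The resistors pull a few pins high or low at power on, the chip latches those levels into a register, and the driver reads a bit field out of that register to learn the configuration. Everything specific here is hypothetical: the function name, the bit position, the field width, and the example value are made up for illustration and are not taken from VIA Technologies documentation or from OpenChrome code.

```python
def decode_strap_field(register_value: int, shift: int, width: int) -> int:
    """Extract a strap field of `width` bits starting at bit `shift`."""
    mask = (1 << width) - 1
    return (register_value >> shift) & mask


# Hypothetical example: assume a 4 bit panel type field sits in the
# upper nibble of an 8 bit register that was latched at power on.
latched = 0xB6
panel_type = decode_strap_field(latched, shift=4, width=4)
print(panel_type)
```

Because the value is fixed by the board itself, the driver can read it again after standby resume and recover the same hardware configuration without relying on state left over from before the suspend.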

I guess this sort of thing happens at large corporations where people only know their own section, and have to rely on others for portions they do not know. I just wish this person had been willing to listen to my concerns about the risky and time consuming code conversion work I would have had to perform for atomic mode setting support when I had only had OpenChrome DRM running on Linux 4.13 for about a week. OpenChrome DRM at that time had “bring up” level code reliability, so I felt doing this kind of code change at that time was not prudent.

Since then, I fixed many issues within OpenChrome DRM, and finally yesterday, I figured out why runtime screen resolution change was crashing the X Server. Now, OpenChrome DRM based mode setting (KMS) appears to be as reliable as OpenChrome DDX’s more mature mode setting (UMS). Looking back, I largely followed what I said during the presentation about OpenChrome DRM, and I am glad the strategy worked.