<![CDATA[Chuan Ji]]>https://jichu4n.com/https://jichu4n.com/favicon.pngChuan Jihttps://jichu4n.com/Ghost 5.32Thu, 09 May 2024 06:26:45 GMT60
<![CDATA[To Create]]>

I was watching my son doodle on my tablet the other day. He recently learned how to draw a closed shape, and he was so proud of it. Drawing one blob after another, he excitedly proclaimed each blob as a pumpkin, a monster, a car, and so on and on.

]]>
https://jichu4n.com/posts/to-create/63b54b081372b200012fd853Wed, 04 Jan 2023 10:08:29 GMT

I was watching my son doodle on my tablet the other day. He recently learned how to draw a closed shape, and he was so proud of it. Drawing one blob after another, he excitedly proclaimed each blob as a pumpkin, a monster, a car, and so on and on.

As I watched, I remembered how much my younger self also loved to create. To write. To compose. To code. Looking back now, I was mediocre and a lot of what I created was honestly pretty embarrassing. Clumsy. Derivative. Trite. Cheesy.

But there's something magical about the act of creating something. There's that special kind of joy, that almost sublime sense of fulfillment when you bring something new into being. In the words of Genesis — to see “every thing that he had made, and, behold, it was very good.” To quote Carleton Noyes in The Gates of Appreciation:

In the exercise of brain or hand, to feel the work take form, develop, and become something, — that is happiness. And the joy is in the creating rather than in the thing created; the completed work is behind us, and we move forward to new creation. A painter's best picture is the blank canvas before him; an author's greatest book is the one he is just setting himself to write. […] The impulse to expression is cosmic and eternal.

And that impulse pushed me to learn and grow. All those stories and essays certainly helped with my SATs and college, no matter how clumsy and derivative the product. All the late nights coding and debugging certainly gave me a head start in computer science and a career in software engineering.

But over the past several years, I’ve noticed that it’s become harder and harder to stay creative in my personal life. I feel lucky that my professional career so far has provided me a lot of space for creativity. Outside of work, however, my drive to do anything creative has slowly atrophied.

Why?

The first reason is time and energy. A lot has changed in my life in the past few years. At home, I’ve been blessed with two beautiful little children of my own. At work, I’ve taken on increasing responsibilities as a tech lead and manager. Of course, I love my children to death and I feel very grateful to have had the professional opportunities that I did. But these life changes have also meant that I have significantly less discretionary time at my disposal. Any such time often has to come at the cost of family time and other responsibilities. And it's often fragmented, and at inconvenient hours when I’m already tired or distracted.

The second reason is toxic perfectionism. As I’ve gotten older and “wiser”, it’s increasingly hard for me to be impressed by what I’m capable of creating myself. For example, if this essay will not approach the level of insight and impact of, say, one of Paul Graham’s, and will not make money and help pay off my mortgage, why bother writing it at all? Several times over the past few years, I'd get the urge to pick up the metaphorical pen and sketch out a blog post, only to then re-read it later and find it just not interesting enough, or polished enough, or otherwise worthy enough to publish. As Picasso famously said, “Every child is an artist. The problem is how to remain an artist once he grows up.”

The result is that I find myself spending all my downtime consuming. Absent-mindedly reading the news. Doomscrolling on Twitter. Impulsively clicking yet another recommended video on YouTube. Shopping for stuff I know I don’t honestly need-need. I try not to think about the growing list of projects I want to do but never started or finished, or the ideas gathering dust in my drafts folder. It certainly feels like I’ve been stagnating, not getting any better at anything or learning new things.

And then occasionally I’ll come across something on the Internet where I’d chuckle to myself, thinking “lol what is this, I bet I could’ve done a better job.” And then I’d always kick myself mentally because, well, at least someone had the drive and the courage to create and publish it, while I’m just here scrolling away.

I’ve decided that I want to change. I want to find the drive to create again. Just like it did in the past, I want it to help me continue to grow personally and professionally, and I trust it will be a reward unto itself. But this time, I also want to model what I would like to see from my children. I want them to also experience that joy of crafting something of their own, and grow up passionate, curious, and self-driven.

Here’s my plan:

I’ll need to better manage my time and energy. The reality of having young children is not going to change for me anytime soon, so I need to make the best of the time that I do have. Some specific thoughts:

  • Consciously plan for personal time — As the Merovingian said in The Matrix Reloaded: “If we do not ever take time, how can we ever have time?” But I’ve been honestly scared to take time for myself. I’d feel very guilty towards my children that I’m not spending as much time as possible with them, or that I’m pushing responsibilities onto my wife and grandpa / grandma. But there is a healthy balance that I can find. Get up an hour before the morning routine. Plan ahead and arrange “shifts” with my wife and grandpa / grandma. Accept that some independent play time, or an extra hour at daycare, or maybe a little too much TV here and there is really fine. (“You need to chill the f* out sometimes”, says my wife.)
  • Set goals and timelines — I’ll need to compensate for the limited and fragmented time by consciously setting goals and timelines, just like how we approach project management at work. Write down and prioritize the list of things I’d like to work on, break out into milestones, and plot those onto a timeline view. Track progress against those timelines and adjust as needed.

And importantly, I'll be blocking out my inner critic. Enjoy the journey, rather than agonizing over the destination. Find motivation in my own growth and fulfillment, rather than external validation and material outcomes. I’ll need to better tune in to my inner child, just like my son as he was doodling that day — he could not care less how his doodles will be compared or judged, or whether he’ll be able to sell them for profit.

I hope I’ll succeed. This essay is the first blog post I’ve published since 2017, almost 6 years ago, so at least I’m already off to a good start.

Happy new year, and wish me luck!

]]>
<![CDATA[How X Window Managers Work, And How To Write One (Part III)]]>

In Part II of this series, we discussed X libraries and implementation choices, and examined the basic structure of a window manager. In Part III, we will start interacting with client windows and the user through events. We will review the fundamentals of window manager implementation, using the implementation in

]]>
https://jichu4n.com/posts/how-x-window-managers-work-and-how-to-write-one-part-iii/5bc6dbd460307a000159bd56Fri, 03 Feb 2017 07:50:00 GMT

In Part II of this series, we discussed X libraries and implementation choices, and examined the basic structure of a window manager. In Part III, we will start interacting with client windows and the user through events. We will review the fundamentals of window manager implementation, using the implementation in our example non-compositing reparenting window manager, basic_wm, for reference.

Step 4: Interaction with Application Windows

Following the steps in Part II of this series, we now have a basic skeleton for our window manager. Our next step is to start talking to clients and the user via events.

The interaction between clients, X, and the window manager is fairly complex. To facilitate our discussion, I’ve created a diagram that illustrates the flow of events throughout the lifetime of a client window, and how a window manager might respond to each of them. We’ll be referring to this cheat sheet for window manager event handling throughout this series. You can click through for the full-sized diagram.

[Diagram: cheat sheet for window manager event handling, showing the flow of events over the lifetime of a client window]

In general, a window manager must handle two kinds of actions: those initiated by client applications (such as creating new windows), and those initiated by users (such as moving or minimizing windows). In this diagram, actions initiated by client applications are shown in the yellow box on the left hand side, and actions initiated by users are shown in blue on the right hand side. A window manager communicates with client applications via events, which are represented as parallelograms in red.

You may have noticed that some of the events in this diagram have the suffix Request, while others have the suffix Notify. This distinction is crucial to our discussion.

Recalling our discussion in Part I on substructure redirection, when a client application wants to do something with a window (such as moving, resizing, showing, or hiding), its request is redirected to the window manager, which can grant, modify, or deny the request. Such requests are delivered to a window manager as events with the Request suffix. It is important to understand that when a window manager receives such an event, the action it represents has not actually occurred, and it is the responsibility of the window manager to decide what to do with it. If the window manager does nothing, the request is implicitly denied.

On the other hand, events with the Notify suffix represent actions that have already been executed by the X server. The window manager can respond to such events, but of course cannot change the fact that they have already happened.
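
To make the distinction concrete, here is a toy model of redirection, written in plain C++ rather than against Xlib. ToyServer, ClientMap() and WindowManagerMap() are hypothetical names invented for this sketch; the real mechanism lives inside the X server. It only illustrates the rule described above: a redirected request has no effect until the window manager acts on it.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <utility>

// Toy stand-in for the X server. None of these names are Xlib APIs.
using WindowId = unsigned long;

class ToyServer {
 public:
  // Plays the role of a window manager that has selected substructure
  // redirection: it is called with the window a client asked to map.
  using MapRequestHandler = std::function<void(ToyServer&, WindowId)>;

  void SetWindowManager(MapRequestHandler handler) {
    wm_handler_ = std::move(handler);
  }

  // A client asks to map a window (the analogue of a client's XMapWindow()).
  void ClientMap(WindowId w) {
    if (wm_handler_) {
      // Redirected: nothing has happened yet, the window manager decides.
      wm_handler_(*this, w);
    } else {
      // No window manager running: the request is executed directly.
      mapped_.insert(w);
    }
  }

  // The window manager maps a window itself; this is never redirected.
  void WindowManagerMap(WindowId w) { mapped_.insert(w); }

  bool IsMapped(WindowId w) const { return mapped_.count(w) > 0; }

 private:
  MapRequestHandler wm_handler_;
  std::set<WindowId> mapped_;
};

// Usage sketch: ignoring a request denies it, acting on it grants it.
void DemoRequestVsNotify() {
  ToyServer server;
  server.SetWindowManager([](ToyServer&, WindowId) { /* do nothing */ });
  server.ClientMap(1);
  assert(!server.IsMapped(1));  // Implicitly denied.

  server.SetWindowManager(
      [](ToyServer& s, WindowId w) { s.WindowManagerMap(w); });
  server.ClientMap(2);
  assert(server.IsMapped(2));  // Granted by the window manager.
}
```

The toy omits the Notify half of the story: in real X, the window manager would also receive a MapNotify event after the second call, telling it that the map has already happened.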

With that in mind, let’s dive into the implementation by looking at how our example window manager will handle the life cycle of a client window from creation to destruction.

Creating a Window

When an X client application creates a top-level window (XCreateWindow()), our window manager will receive a CreateNotify event. However, a newly created window is always invisible, so there’s nothing for our window manager to do. In window_manager.cpp:

void WindowManager::Run() {
  ...
  // 2. Main event loop.
  for (;;) {
    // 1. Get next event.
    ...
    // 2. Dispatch event.
    switch (e.type) {
      ...
      case CreateNotify:
        OnCreateNotify(e.xcreatewindow);
        break;
      ...
    }
  }
}

void WindowManager::OnCreateNotify(const XCreateWindowEvent& e) {}

Configuring a Newly Created Window

At this stage, the application can configure the window to set its initial size, position, or other attributes. To do so, the application would invoke XConfigureWindow(), which would send a ConfigureRequest event to the window manager. However, since the window is still invisible, the window manager doesn’t need to care and can grant such requests without modification by invoking XConfigureWindow() itself with the same parameters.

void WindowManager::Run() {
      ...
      case ConfigureRequest:
        OnConfigureRequest(e.xconfigurerequest);
        break;
      ...
}

void WindowManager::OnConfigureRequest(const XConfigureRequestEvent& e) {
  XWindowChanges changes;
  // Copy fields from e to changes.
  changes.x = e.x;
  changes.y = e.y;
  changes.width = e.width;
  changes.height = e.height;
  changes.border_width = e.border_width;
  changes.sibling = e.above;
  changes.stack_mode = e.detail;
  // Grant request by calling XConfigureWindow().
  XConfigureWindow(display_, e.window, e.value_mask, &changes);
  LOG(INFO) << "Resize " << e.window << " to " << Size<int>(e.width, e.height);
}
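
Why is it safe to copy every field and pass e.value_mask through unchanged? Because only the fields whose bits are set in the mask are applied. The following sketch models that rule without Xlib: the kCW* constants mirror the values of CWX, CWY, CWWidth and CWHeight in <X11/X.h>, while Geometry and ApplyMaskedChanges() are hypothetical names made up for illustration.

```cpp
#include <cassert>

// Same bit values as CWX, CWY, CWWidth and CWHeight in <X11/X.h>, redefined
// here only so that the sketch compiles without X11 headers.
constexpr unsigned long kCWX = 1UL << 0;
constexpr unsigned long kCWY = 1UL << 1;
constexpr unsigned long kCWWidth = 1UL << 2;
constexpr unsigned long kCWHeight = 1UL << 3;

// Hypothetical, simplified geometry record (not an Xlib type).
struct Geometry {
  int x;
  int y;
  int width;
  int height;
};

// Returns `current` with only the fields selected by `value_mask` replaced by
// the corresponding fields of `requested`. Unselected fields are untouched.
Geometry ApplyMaskedChanges(Geometry current, const Geometry& requested,
                            unsigned long value_mask) {
  if (value_mask & kCWX) current.x = requested.x;
  if (value_mask & kCWY) current.y = requested.y;
  if (value_mask & kCWWidth) current.width = requested.width;
  if (value_mask & kCWHeight) current.height = requested.height;
  return current;
}

// Usage sketch: a resize-only request leaves the position alone.
void DemoMaskedResize() {
  const Geometry before{10, 20, 300, 200};
  const Geometry requested{0, 0, 640, 480};
  const Geometry after =
      ApplyMaskedChanges(before, requested, kCWWidth | kCWHeight);
  assert(after.x == 10 && after.y == 20);
  assert(after.width == 640 && after.height == 480);
}
```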

Mapping a Window

To make the window finally visible on screen, the client application will call XMapWindow() to map it. This sends a MapRequest event to the window manager. As noted earlier, at this point, the window is still not yet visible, as it’s up to the window manager to actually make it so. This is probably the most important event in our discussion, as this is where a window manager would usually start really managing a window.

A reparenting window manager would typically respond to a MapRequest for a client application window w with the following actions:

  1. Create a frame window f, perhaps with borders and window decoration (e.g. title, minimize / maximize / close buttons).

  2. Register for substructure redirect on f with XSelectInput(). Recall that substructure redirect only applies to direct child windows, so after reparenting, the substructure redirect previously registered on the root window would no longer apply to w, hence this step.

  3. Make w a child of f with XReparentWindow().

  4. Render f and w with XMapWindow().

  5. Register for mouse or keyboard shortcuts on w and/or f.

The example implementation in basic_wm will create a very simple frame window that has the same size as the client window, but with a 3px red border:

void WindowManager::Run() {
      ...
      case MapRequest:
        OnMapRequest(e.xmaprequest);
        break;
      ...
}

void WindowManager::OnMapRequest(const XMapRequestEvent& e) {
  // 1. Frame or re-frame window.
  Frame(e.window);
  // 2. Actually map window.
  XMapWindow(display_, e.window);
}

void WindowManager::Frame(Window w) {
  // Visual properties of the frame to create.
  const unsigned int BORDER_WIDTH = 3;
  const unsigned long BORDER_COLOR = 0xff0000;
  const unsigned long BG_COLOR = 0x0000ff;

  // 1. Retrieve attributes of window to frame.
  XWindowAttributes x_window_attrs;
  CHECK(XGetWindowAttributes(display_, w, &x_window_attrs));

  // 2. TODO - see Framing Existing Top-Level Windows section below.

  // 3. Create frame.
  const Window frame = XCreateSimpleWindow(
      display_,
      root_,
      x_window_attrs.x,
      x_window_attrs.y,
      x_window_attrs.width,
      x_window_attrs.height,
      BORDER_WIDTH,
      BORDER_COLOR,
      BG_COLOR);
  // 3. Select events on frame.
  XSelectInput(
      display_,
      frame,
      SubstructureRedirectMask | SubstructureNotifyMask);
  // 4. Add client to save set, so that it will be restored and kept alive if we
  // crash.
  XAddToSaveSet(display_, w);
  // 5. Reparent client window.
  XReparentWindow(
      display_,
      w,
      frame,
      0, 0);  // Offset of client window within frame.
  // 6. Map frame.
  XMapWindow(display_, frame);
  // 7. Save frame handle.
  clients_[w] = frame;
  // 8. Grab events for window management actions on client window.
  //   a. Move windows with alt + left button.
  XGrabButton(...);
  //   b. Resize windows with alt + right button.
  XGrabButton(...);
  //   c. Kill windows with alt + f4.
  XGrabKey(...);
  //   d. Switch windows with alt + tab.
  XGrabKey(...);

  LOG(INFO) << "Framed window " << w << " [" << frame << "]";
}

The outline of the code should be fairly clear following our discussion. A few additional points to note:

  • Regarding the save-set and XAddToSaveSet():

    The save-set is a list of windows, usually maintained by the window manager, but including only windows created by other clients. If the window manager dies, all windows listed in the save-set will be reparented back to their closest living ancestor if they were reparented in the first place and mapped if the window manager has unmapped them so that it could map an icon.

    The save-set is necessary because the window manager might not exit normally. The user might kill it with CTRL-C if it is running in the foreground, or more likely, the user might get the process number and kill it. Actually, the actions of the save-set are performed even if the window manager exits normally, so less code is needed since the save-set does the cleaning up.

    Window managers almost always place in the save-set all the windows they reparent or iconify, using XAddToSaveSet().

    Windows are automatically removed from the save-set when they are destroyed.

    — Xlib Programming Manual §16.4

  • When our window manager creates a frame window (step 3 in the example code), it will also trigger a CreateNotify event for the frame window. It will ignore it just like it ignores other CreateNotify events as discussed earlier.

  • When our window manager calls XReparentWindow() in step 5 in the example code, it will trigger a ReparentNotify event, which it will ignore:

    void WindowManager::Run() {
          ...
          case ReparentNotify:
            OnReparentNotify(e.xreparent);
            break;
          ...
    }
    
    void WindowManager::OnReparentNotify(const XReparentEvent& e) {}
    
  • When our window manager calls XMapWindow() to map the frame window (step 6 in the example code), the X server knows that the action originates from the current window manager, and will execute it directly instead of redirecting it back as a MapRequest event. Our window manager will later receive a MapNotify event, which it can ignore:

    void WindowManager::Run() {
          ...
          case MapNotify:
            OnMapNotify(e.xmap);
            break;
          ...
    }
    
    void WindowManager::OnMapNotify(const XMapEvent& e) {}
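
To see why the save-set matters, it can help to simulate what happens when a window manager dies. The sketch below is a toy model in plain C++, not Xlib code: ToyWindowTree and its methods are hypothetical names, and the rules are a simplification of the behavior described in the quoted passage. Windows in the save-set are reparented to their closest ancestor that the window manager did not create and are mapped; everything the window manager created is destroyed along with whatever is still inside it.

```cpp
#include <cassert>
#include <map>
#include <set>

using WindowId = unsigned long;
constexpr WindowId kRoot = 0;  // Toy root window; real IDs differ.

struct ToyWindow {
  WindowId parent;
  bool created_by_wm;  // True for frames, false for client windows.
  bool mapped;
};

class ToyWindowTree {
 public:
  void Add(WindowId id, WindowId parent, bool created_by_wm, bool mapped) {
    windows_[id] = ToyWindow{parent, created_by_wm, mapped};
  }
  void AddToSaveSet(WindowId id) { save_set_.insert(id); }
  bool Exists(WindowId id) const { return windows_.count(id) > 0; }
  const ToyWindow& Get(WindowId id) const { return windows_.at(id); }

  // Simulates the window manager's connection going away.
  void OnWindowManagerExit() {
    // 1. Rescue save-set windows: reparent to the closest ancestor not
    //    created by the window manager, and make sure they are mapped.
    for (WindowId id : save_set_) {
      auto it = windows_.find(id);
      if (it == windows_.end()) continue;
      WindowId ancestor = it->second.parent;
      while (ancestor != kRoot && windows_.at(ancestor).created_by_wm) {
        ancestor = windows_.at(ancestor).parent;
      }
      it->second.parent = ancestor;
      it->second.mapped = true;
    }
    save_set_.clear();
    // 2. Destroy the window manager's windows, then anything left orphaned.
    bool changed = true;
    while (changed) {
      changed = false;
      for (auto it = windows_.begin(); it != windows_.end();) {
        const ToyWindow& w = it->second;
        const bool orphaned =
            w.parent != kRoot && windows_.count(w.parent) == 0;
        if (w.created_by_wm || orphaned) {
          it = windows_.erase(it);
          changed = true;
        } else {
          ++it;
        }
      }
    }
  }

 private:
  std::map<WindowId, ToyWindow> windows_;
  std::set<WindowId> save_set_;
};

// Usage sketch: client 1 is in the save-set and survives, client 2 is not.
void DemoSaveSet() {
  ToyWindowTree tree;
  tree.Add(10, kRoot, /*created_by_wm=*/true, /*mapped=*/true);  // Frame.
  tree.Add(1, 10, /*created_by_wm=*/false, /*mapped=*/true);
  tree.Add(20, kRoot, /*created_by_wm=*/true, /*mapped=*/true);  // Frame.
  tree.Add(2, 20, /*created_by_wm=*/false, /*mapped=*/true);
  tree.AddToSaveSet(1);
  tree.OnWindowManagerExit();
  assert(tree.Exists(1) && tree.Get(1).parent == kRoot);
  assert(!tree.Exists(2));  // Destroyed together with its frame.
}
```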
    

Configuring a Mapped Window

A client application can configure a window that is currently visible, again with the XConfigureWindow() function. For example, an application may want to resize a window to better accommodate its contents. When a reparenting window manager receives the resulting ConfigureRequest and decides to grant the request, it additionally needs to resize / reposition the corresponding frame window and any window decorations.

void WindowManager::OnConfigureRequest(const XConfigureRequestEvent& e) {
  XWindowChanges changes;
  // Copy fields from e to changes.
  ...
  if (clients_.count(e.window)) {
    const Window frame = clients_[e.window];
    XConfigureWindow(display_, frame, e.value_mask, &changes);
    LOG(INFO) << "Resize [" << frame << "] to " << Size<int>(e.width, e.height);
  }
  // Grant request by calling XConfigureWindow().
  ...
}

When our window manager re-configures the frame window with the XConfigureWindow() call above, the X server knows that the action originates from the current window manager, and will execute it directly instead of redirecting it back as a ConfigureRequest event. Our window manager will then receive a ConfigureNotify event, which it will ignore:

void WindowManager::Run() {
      ...
      case ConfigureNotify:
        OnConfigureNotify(e.xconfigure);
        break;
      ...
}

void WindowManager::OnConfigureNotify(const XConfigureEvent& e) {}

Unmapping a Window

When a client application unmaps (i.e. hides) a window with XUnmapWindow(), for example in response to the user exiting or minimizing the application, the window manager will receive an UnmapNotify event. Unlike the MapRequest event, the UnmapNotify event is delivered to the window manager after the fact, and the window manager can only respond to it, not intercept it.

A reparenting window manager will typically want to reverse the actions it performed in response to MapRequest. In other words, it would reparent the client window back to the root window, and destroy the corresponding frame window.

void WindowManager::Run() {
      ...
      case UnmapNotify:
        OnUnmapNotify(e.xunmap);
        break;
      ...
}

void WindowManager::OnUnmapNotify(const XUnmapEvent& e) {
  // If the window is a client window we manage, unframe it upon UnmapNotify. We
  // need the check because we will receive an UnmapNotify event for a frame
  // window we just destroyed ourselves.
  if (!clients_.count(e.window)) {
    LOG(INFO) << "Ignore UnmapNotify for non-client window " << e.window;
    return;
  }

  Unframe(e.window);
}

void WindowManager::Unframe(Window w) {
  // We reverse the steps taken in Frame().
  const Window frame = clients_[w];
  // 1. Unmap frame.
  XUnmapWindow(display_, frame);
  // 2. Reparent client window back to root window.
  XReparentWindow(
      display_,
      w,
      root_,
      0, 0);  // Offset of client window within root.
  // 3. Remove client window from save set, as it is now unrelated to us.
  XRemoveFromSaveSet(display_, w);
  // 4. Destroy frame.
  XDestroyWindow(display_, frame);
  // 5. Drop reference to frame handle.
  clients_.erase(w);

  LOG(INFO) << "Unframed window " << w << " [" << frame << "]";
}

A few additional points to note:

  • When our window manager unmaps the frame window with XUnmapWindow() in step 1 in the example code above, it will again receive a corresponding UnmapNotify event. This is the reason why the UnmapNotify event handler needs to check that the unmapped window is an actual client window.

  • When our window manager makes the client window a direct child of the root window with XReparentWindow() in step 2 above, it will receive a ReparentNotify event. As discussed in the Mapping a Window section above, this ReparentNotify event will be ignored.

  • When our window manager destroys the frame window with XDestroyWindow() in step 4, it will trigger a DestroyNotify event. This event will also be ignored, as shown in the next section.

At this point, the client window has become invisible, but not yet destroyed. It can be displayed again with a call to XMapWindow(), which would take us back to the Mapping a Window step. It could also be reconfigured in this state, which would take us back to the Configuring a Newly Created Window step.

Destroying a Window

When a client application exits or no longer needs a window, it will call XDestroyWindow() to dispose of the window. This triggers a DestroyNotify event. In our case, there’s nothing we need to do in response.

void WindowManager::Run() {
      ...
      case DestroyNotify:
        OnDestroyNotify(e.xdestroywindow);
        break;
      ...
}

void WindowManager::OnDestroyNotify(const XDestroyWindowEvent& e) {}

Framing Existing Top-Level Windows

Now that we’ve walked through the life cycle of a client window, from creation to destruction, let’s turn our attention to the problem of existing top-level windows.

You may recall from Part I that X applications in general run just fine without a window manager. Depending on how an X session is started (e.g. xinitrc), by the time a window manager starts, any number of windows may have already been created by other applications. Additionally, the user can kill a running window manager and replace it with a different window manager, without affecting windows from other applications.

Therefore, when our window manager starts up, it needs to handle any existing top-level windows that are already mapped. As a reparenting window manager, it will invoke the same Frame() function on such windows as if these windows were being mapped for the first time:

void WindowManager::Run() {
  // 1. Initialization.
  //   a. Select events on root window. Use a special error handler so we can
  //   exit gracefully if another window manager is already running.
  ...
  //   b. Set error handler.
  ...
  //   c. Grab X server to prevent windows from changing under us while we
  //   frame them.
  XGrabServer(display_);
  //   d. Frame existing top-level windows.
  //     i. Query existing top-level windows.
  Window returned_root, returned_parent;
  Window* top_level_windows;
  unsigned int num_top_level_windows;
  CHECK(XQueryTree(
      display_,
      root_,
      &returned_root,
      &returned_parent,
      &top_level_windows,
      &num_top_level_windows));
  CHECK_EQ(returned_root, root_);
  //     ii. Frame each top-level window.
  for (unsigned int i = 0; i < num_top_level_windows; ++i) {
    Frame(top_level_windows[i], true /* was_created_before_window_manager */);
  }
  //     iii. Free top-level window array.
  XFree(top_level_windows);
  //   e. Ungrab X server.
  XUngrabServer(display_);

  // 2. Main event loop.
  ...
}

void WindowManager::OnMapRequest(const XMapRequestEvent& e) {
  // 1. Frame or re-frame window.
  Frame(e.window, false /* was_created_before_window_manager */);
  ...
}

void WindowManager::Frame(Window w, bool was_created_before_window_manager) {
  ...
  // 1. Retrieve attributes of window to frame.
  ...
  // 2. If window was created before window manager started, we should frame
  // it only if it is visible and doesn't set override_redirect.
  if (was_created_before_window_manager) {
    if (x_window_attrs.override_redirect ||
        x_window_attrs.map_state != IsViewable) {
      return;
    }
  }
  // 3. Create frame.
  ...
}

void WindowManager::OnUnmapNotify(const XUnmapEvent& e) {
  ...

  // Ignore event if it is triggered by reparenting a window that was mapped
  // before the window manager started.
  //
  // Since we receive UnmapNotify events from the SubstructureNotify mask, the
  // event attribute specifies the parent window of the window that was
  // unmapped. This means that an UnmapNotify event from a normal client window
  // should have this attribute set to a frame window we maintain. Only an
  // UnmapNotify event triggered by reparenting a pre-existing window will have
  // this attribute set to the root window.
  if (e.event == root_) {
    LOG(INFO) << "Ignore UnmapNotify for reparented pre-existing window "
              << e.window;
    return;
  }

  Unframe(e.window);
}

Some additional things to note:

  • You may notice that the process of framing existing top-level windows is guarded by XGrabServer() and XUngrabServer(). From the Xlib Programming Manual:

    These functions can be used to control processing of output on other connections by the window system server. While the server is grabbed, no processing of requests or close downs on any other connection will occur.

    — Xlib Programming Manual §9.5

    By grabbing the X server, our window manager ensures that, between the time when it fetches the list of existing top-level windows and when it finishes framing them, no other application can interfere and mess up our state: no new windows can be created, and no existing windows can be modified or destroyed.

  • The override_redirect attribute, if set to true, indicates that a window should not be managed by window managers. From the Xlib Programming Manual:

    To control window placement or to add decoration, a window manager often needs to intercept (redirect) any map or configure request. Pop-up windows, however, often need to be mapped without a window manager getting in the way. […]

    The override-redirect flag specifies whether map and configure requests on this window should override a SubstructureRedirectMask on the parent. You can set the override-redirect flag to True or False (default). Window managers use this information to avoid tampering with pop-up windows […].

    — Xlib Programming Manual §3.2.8

    The reason our window manager doesn’t need to check for this attribute except at start up is that the X server knows not to redirect events from such windows:

    The window manager […] will normally ignore windows that are mapped with their override_redirect attribute set, since no *Request events will be generated for them.

    — Xlib Programming Manual §16.3

  • The map_state attribute indicates whether a window is currently visible (mapped). When Frame() is invoked for pre-existing windows during start up, we want to ignore windows that are currently unmapped. However, when Frame() is invoked during the event loop as part of the MapRequest handler, we know that the client window to be framed is necessarily still unmapped, as our window manager wouldn’t have granted the request yet.

  • You might be wondering why an additional check for e.event == root_ is needed in the UnmapNotify handler. It turns out that reparenting an already mapped window (XReparentWindow()) will trigger a pair of UnmapNotify and MapNotify events in addition to ReparentNotify. Therefore, when we enter into the event loop, we will receive an UnmapNotify event for every pre-existing top-level window we reparented. We can distinguish these events by their event attribute, which in this case represents the parent of the client window. Normally, when a client window we already framed is unmapped, the event attribute would be its frame window. But when a pre-existing window is reparented at start up, the event attribute in the resulting UnmapNotify event will be its original parent - i.e., the root window.
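
The two start-up special cases discussed in this section boil down to a pair of small decisions, which can be isolated from Xlib and tested on their own. The sketch below does that in plain C++. MapState, ToyAttributes, ShouldFrame() and ClassifyUnmap() are hypothetical names introduced for illustration; they mirror the checks in Frame() and OnUnmapNotify() but are not part of basic_wm.

```cpp
#include <cassert>
#include <unordered_map>

using WindowId = unsigned long;

// Mirrors the three map states Xlib reports (IsUnmapped, IsUnviewable,
// IsViewable), redefined so the sketch compiles without X11 headers.
enum class MapState { kUnmapped, kUnviewable, kViewable };

// Hypothetical subset of XWindowAttributes.
struct ToyAttributes {
  bool override_redirect;
  MapState map_state;
};

// Windows arriving via MapRequest are always framed. Pre-existing windows are
// framed only if they are viewable and do not set override_redirect.
bool ShouldFrame(const ToyAttributes& attrs,
                 bool was_created_before_window_manager) {
  if (!was_created_before_window_manager) return true;
  return !attrs.override_redirect && attrs.map_state == MapState::kViewable;
}

enum class UnmapAction { kIgnoreNonClient, kIgnoreStartupReparent, kUnframe };

// `event_window` is the event field of the UnmapNotify (the parent of the
// unmapped window); `clients` maps client windows to their frames.
UnmapAction ClassifyUnmap(
    WindowId event_window, WindowId window, WindowId root,
    const std::unordered_map<WindowId, WindowId>& clients) {
  if (clients.count(window) == 0) return UnmapAction::kIgnoreNonClient;
  if (event_window == root) return UnmapAction::kIgnoreStartupReparent;
  return UnmapAction::kUnframe;
}

// Usage sketch covering the three UnmapNotify cases from the text.
void DemoStartupChecks() {
  const WindowId root = 1, client = 100, frame = 200;
  const std::unordered_map<WindowId, WindowId> clients = {{client, frame}};
  // Our own frame being unmapped: not a client window.
  assert(ClassifyUnmap(root, frame, root, clients) ==
         UnmapAction::kIgnoreNonClient);
  // Pre-existing client reparented at start-up: event is the root window.
  assert(ClassifyUnmap(root, client, root, clients) ==
         UnmapAction::kIgnoreStartupReparent);
  // Framed client unmapped by its application: event is the frame.
  assert(ClassifyUnmap(frame, client, root, clients) ==
         UnmapAction::kUnframe);
}
```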

What’s Next

At this point, we have a basic but functional reparenting window manager that will correctly handle the life cycle of windows. If you strip out window decorations, shortcuts and fancy UI, the core structure of every X window manager will quite closely resemble what we have here.

In our next installment, we will improve the user-facing functionality of our window manager by adding ways to move, resize and close windows. In the meantime, you’re more than welcome to check out the code for basic_wm on GitHub.

]]>
<![CDATA[Flashing a Sprint Nexus S 4G to Verizon]]>

I originally wrote the following post in July 2012 to document how to fully flash a Nexus S 4G from Sprint to Verizon Wireless, but never got to publishing it. I have long since switched away from Verizon Wireless and no longer own any of the phones mentioned, and decided

]]>
https://jichu4n.com/posts/flashing-a-sprint-nexus-s-4g-to-verizon/5bc6e46460307a000159bd9fWed, 11 Nov 2015 08:27:00 GMT

I originally wrote the following post in July 2012 to document how to fully flash a Nexus S 4G from Sprint to Verizon Wireless, but never got to publishing it. I have long since switched away from Verizon Wireless and no longer own any of the phones mentioned, and decided to publish it for what it’s worth.

If you’re interested in the high-level view of how CDMA phones are programmed, check out my previous article Carrier Programming on CDMA Android Phones.

Introduction

This is a guide for flashing a Sprint Samsung Nexus S 4G onto a standard Verizon monthly plan, with voice, texting and 3G data all fully functional. This method will likely also work on many other phones, especially Samsung ones, and should theoretically work just fine on any ROM as well.

Note that this procedure entails ESN cloning, which may be illegal in your region. Please make sure you have an understanding of all applicable laws before proceeding. The author of this guide cannot be held responsible for any legal infractions that may ensue. Finally, you are solely responsible for any consequences of your actions as a result of following this guide. While I believe it to be quite safe, I cannot guarantee you that this process will not brick your devices or start a zombie apocalypse.

Requirements and Setup

We will need the following before starting:

  • A currently active Verizon Wireless Android phone and plan. Please beware that not all phones will work; many phones do not allow us to extract all the information we need for this process. For instance, my Motorola Droid 3 did not work, and I had to purchase and activate an HTC Incredible for this. This phone will be our donor phone.

  • A rooted Samsung Nexus S 4G.

  • A computer running a recent version of Windows, with at least 2 USB ports.

  • 2 MicroUSB cables.

In addition, we need the following software:

  • USB diagnostic drivers for your donor phone. These vary by manufacturer and their configuration may be quite complex; search online for how to set up yours. For HTC phones, download and install HTC Sync from HTC’s website (archived here).

  • USB diagnostic drivers for the Nexus S 4G. The easiest way is to download and install PdaNet (archived here). We don’t actually need the functionality of PdaNet, but it happens to bundle the drivers we need. Select Samsung when prompted for the manufacturer of our phone.

  • QPST and QXDM (archived here). These are internal Qualcomm tools; we will use them to clone the MEID of the donor phone.

  • DFS (archived here). This is a third-party tool for working with CDMA phones. The demo version will suffice.

Step 1: Preparation

First, make sure the donor phone is fully activated and that calls, text and data are all fully functional.

Now, find the HEX MEID of the donor phone. On the HTC Incredible, this can be found under Settings → About Phone → Phone Identity. On most Android phones, this is under Settings → About Phone → Status. It is also usually printed on a label underneath the phone battery. This should be 14 digits and usually begins with A00000.

Now put the donor phone in airplane mode.

We will next connect both phones to the computer in diagnostic mode.

  • On the Nexus S 4G, dial *#*#8778#*#*, then select MODEM under USB. Connect the Nexus S 4G to the computer. If drivers were installed correctly, you should now see a SAMSUNG Mobile Modem Diagnostic Serial Port (WDM) under Ports (COM & LPT) in Windows’s Device Manager. Note down the port number of the Nexus S 4G; we will assume COM4 for convenience.

  • On the HTC Incredible, dial ##3424#, hit Call, connect it to the computer, then follow the instructions in this thread with Device Manager. If everything goes well you should now see an HTC Diagnostic Interface under Ports (COM & LPT) in Device Manager. Note down the port number of the HTC Incredible; we will assume COM11 for convenience.

After confirming that both phones show up under Ports (COM & LPT), run QPST Configuration and hit Add New Port…. You should see both phones on the left. Select one and hit OK. Then repeat for the other phone.

Step 2: Cloning

We will now clone the MEID of the donor phone onto the target phone. Open QXDM Professional. Select Options → Communications…. For Target Port, select the port corresponding to the Nexus S 4G, say COM4, and hit OK. Then in the text box labeled Command, type the following:

Password 01F2030F5F678FF9

Hit Enter. You should see Password Result = Correct at the bottom of the Command Output window.

Now in the Command text box, type the following, replacing <MEID> by the hex MEID of the donor phone:

RequestNVItemWrite meid 0x<MEID>

Hit Enter. You should see it repeat back the MEID to you. Type RequestNVItemRead meid followed by Enter to make sure the write happened.
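
Since a mistyped MEID is easy to miss, here is a minimal Python sketch that checks the format described in Step 1 (14 hex digits) and builds the write command shown above. The function name and the sample MEID are made up for illustration; the script only formats text and does not talk to QXDM.

```python
import re

# 14 hexadecimal digits, as described in Step 1.
MEID_PATTERN = re.compile(r"[0-9A-F]{14}")


def qxdm_meid_write_command(meid: str) -> str:
    """Return the QXDM command line for writing the given hex MEID."""
    cleaned = meid.strip().upper()
    if cleaned.startswith("0X"):
        cleaned = cleaned[2:]  # tolerate a pasted "0x" prefix
    if not MEID_PATTERN.fullmatch(cleaned):
        raise ValueError("expected 14 hex digits, got %r" % meid)
    return "RequestNVItemWrite meid 0x" + cleaned


# Hypothetical MEID, for illustration only.
print(qxdm_meid_write_command("a0000012345678"))
# prints: RequestNVItemWrite meid 0xA0000012345678
```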

Close QXDM Professional and reboot the Nexus S 4G. Rebooting the target phone after writing a new MEID is essential! If prompted, do NOT go through the activation process on the Nexus S 4G. Just select Skip when possible.

Step 3: Reset SPC

In order to program the Nexus S 4G, we need to obtain or reset the SPC code (also known as the MSL) of the phones. If you already know the SPC code for your phones and would rather not reset the SPC code, you can skip this step.

Open QXDM Professional again. Go through Options → Communications and make sure the Target Port is still COM4. Hit OK. In the command text box, type the following two lines, each followed by an Enter:

Password 01F2030F5F678FF9
RequestNVItemWrite sec_code 000000

If this is successful, the SPC on the Nexus S 4G will have been reset to 000000. Repeat for the donor phone (in our example, on COM11).

Close QXDM Professional when done. All QPST / QXDM programs must be closed before proceeding, or DFS will complain.

Step 4: Programming CDMA Chipset

Launch DFS. Click Ports on the top left. In the dialog box, double-click on each port representing our phones, in our case, COM4 and COM11, and close the dialog box. Now make sure the SPC text box above reads 000000 (or if you chose not to reset the SPC, whatever your SPC is). Type 01F2030F5F678FF9 into the Pwd text box.

Step 4.a: Basic CDMA settings

Let’s now copy basic CDMA settings from the donor phone to the target phone. In the drop-down menu next to Ports, select the donor phone (COM11). Switch to the Programming tab and the NAM tab under it. Click the SPC and Pwd buttons to send the SPC and NV password to the phone to unlock it. Then hit Read beneath Network identification. The text fields below should be populated by information from the donor phone. Now select COM4 in the drop-down menu next to Ports, click SPC and Pwd, then click Write beneath Network identification.

Step 4.b: PRL

Now, let’s copy over the PRL (or Preferred Roaming List - essentially a database of tower locations). Again, select COM11 under Ports and click Read under PRL towards the right. Then select COM4 under Ports and click Write under PRL.

Step 4.c: 2G Data settings

Now go to the next tab, Data. Select COM11 and hit Read. Switch to COM4 and hit Write. If you observe errors in the log window at the bottom, and / or no information shows up in the Pwd boxes under PPP and HDR AN Long, your donor phone doesn’t support the extraction of some data passwords, and you will have to try a different donor phone.

Step 4.d: 3G Data settings

We do the same with Mobile IP. Select COM11 and hit Read. Switch to COM4 and hit Write. Then hit Write current profile settings. Again, if you encounter errors, or no information shows up under AAA Shared secret and HA Shared Secret, you will have to try a different donor phone.

Finally, we need to copy over a file containing a secret key used to establish a 3G connection. Go to the EFS tab towards the top. Select COM11 and hit Read EFS. Click the "+" sign of the root folder on the left and then the "+" sign next to the DMU folder and you should see a file called 10.key. Right-click on this file and hit Save…. Save this file somewhere on your hard drive. Now switch to COM4 and hit Read EFS again. Select the root folder on the left, type /DMU under Path (57 max) on the right and hit Add Item to create a folder named DMU on the Nexus S 4G. Then select DMU on the left, click on the check box named I want to add file from PC, and select the 10.key file we just saved. Then hit Add Item to upload the 10.key file to /DMU/10.key.

The programming is now done. Close DFS, select OK and OK again when prompted whether to send a mode reset to the phone. The Nexus S 4G will now reboot.

Step 5: Programming Android

We will now modify some Android system files to Verizon settings. This step will need to be repeated whenever you flash a new ROM on the Nexus S 4G in the future.

1-click programming app

I have built a 1-click Android application that automates this step; you can find it here. It requires root as it will need to modify system files. Install the application and click the button in the middle. When the process finishes, you will be prompted to reboot. Simply reboot your phone and you’re all set; you may uninstall the app, or you could keep it around for when you update to a new ROM.

If the 1-click app does not work for you, or if you want more control over what it does, follow the steps below. You do not need to do any of the following if you already used the app above.

Step 5.a: Fix voice calling

This is required to enable calls on Verizon. On the Nexus S 4G, connect to WiFi and install ES File Explorer from the Play Store. Then go to Menu → Settings and check Up to Root, Root Explorer, and Mount File System. Allow it to use superuser privileges if asked. Then navigate to /system and open build.prop with ES Note Editor. Find the following two lines:

ro.cdma.home.operator.numeric=310120
ro.cdma.home.operator.alpha=Sprint

Change them to read:

ro.cdma.home.operator.numeric=310004
ro.cdma.home.operator.alpha=Verizon

Then add a line below (or anywhere in the file) that says:

ro.cdma.homesystem=64,65,76,77,78,79,80,81,82,83

Hit Menu → Save and then Back to exit.
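
If you would rather edit a copy of build.prop on your computer, the same change can be sketched in a few lines of Python. This is only an illustration of the edit described above: the function name is hypothetical, and it works on the file's text, so getting build.prop off and back onto the phone is still up to you.

```python
# The three Verizon properties from this step.
VERIZON_PROPS = {
    "ro.cdma.home.operator.numeric": "310004",
    "ro.cdma.home.operator.alpha": "Verizon",
    "ro.cdma.homesystem": "64,65,76,77,78,79,80,81,82,83",
}


def patch_build_prop(text: str, props: dict) -> str:
    """Replace existing key=value lines and append any keys that are missing."""
    remaining = dict(props)
    out = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_setting = "=" in line and not line.lstrip().startswith("#")
        if is_setting and key in remaining:
            out.append("%s=%s" % (key, remaining.pop(key)))
        else:
            out.append(line)  # comments and unrelated settings stay as-is
    for key, value in remaining.items():
        out.append("%s=%s" % (key, value))
    return "\n".join(out) + "\n"


sample = (
    "ro.cdma.home.operator.numeric=310120\n"
    "ro.cdma.home.operator.alpha=Sprint\n"
)
print(patch_build_prop(sample, VERIZON_PROPS), end="")
```

Run against the two Sprint lines, this prints the two replaced lines followed by the appended ro.cdma.homesystem line.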

Step 5.b: Fix roaming

This step tells Android to treat the Verizon network as the home network. It sets the roaming status to "Not roaming", removes the triangular roaming indicator in the status bar and changes the displayed name of the current network to "Verizon Wireless".

Now navigate to /system/framework and copy framework-res.apk onto your computer - you can use adb or email it to yourself or whatever (you have to copy it to the internal SD card before you can email it though). Once on the computer, rename it framework-res.zip (yes, an APK is just a zip archive) and use your favorite program to replace the file res/xml/eri.xml inside it with the one at this link. Now rename it back to framework-res.apk, copy it back onto the Nexus S 4G (using adb or whatever) and use ES File Explorer to overwrite the existing /system/framework/framework-res.apk with it.
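
Since an APK is just a zip archive, the "replace one file inside it" part can also be scripted. Below is a rough Python sketch using the standard zipfile module; the function name and the local file names in the usage comment are placeholders, not part of the original procedure.

```python
import zipfile


def replace_zip_member(src_path, dst_path, member, new_bytes):
    """Copy the archive at src_path to dst_path, swapping in new_bytes for member."""
    replaced = False
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == member:
                data = new_bytes
                replaced = True
            # Passing the ZipInfo keeps each entry's original compression type.
            dst.writestr(info, data)
    if not replaced:
        raise KeyError(member)


# Example usage (file names are placeholders):
#   with open("eri.xml", "rb") as f:
#       replace_zip_member("framework-res.apk", "framework-res-new.apk",
#                          "res/xml/eri.xml", f.read())
```

Writing to a new file rather than modifying the archive in place means the original framework-res.apk stays around as a backup.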

Step 5.c: Add Verizon APNs

This last step sets up Verizon APNs that enable web and MMS. Use ES File Explorer or any other tool to replace /system/etc/apns-conf.xml with the one at this link.

Step 6: Reboot & PROFIT

The Nexus S 4G should now be a fully functional Verizon Wireless phone. If you are curious about what each of the steps means, you’re welcome to check out my previous article Carrier Programming on CDMA Android Phones for a high-level view of how CDMA phones are programmed.

While your mileage may vary, I hope the above information has been of help. Since I no longer use Verizon Wireless and no longer own any of the phones mentioned above, please take all this information with a grain of salt.

Good luck, and happy hacking!

]]>
<![CDATA[Setting Up DKIM And SRS In Postfix]]>

]]>
https://jichu4n.com/posts/setting-up-dkim-and-srs-in-postfix/5bc6df7260307a000159bd7fTue, 09 Dec 2014 08:06:00 GMT

In my previous post Custom Domain E-mails With Postfix And Gmail: The Missing Tutorial, we set up a Postfix mail server on a custom domain that integrates seamlessly with Gmail.

However, the tutorial skipped two important security standards that will help prevent e-mails routed through our server from being marked as spam: DKIM and SRS. This article will show you how to add support for DKIM and SRS to a Postfix server.

Similar to the previous tutorial, we will assume an Ubuntu server in our examples.

Step 1: DKIM

DKIM, short for DomainKeys Identified Mail, is a mechanism for

  1. A sender e-mail program to sign an outgoing e-mail message, and

  2. A recipient e-mail program to verify said signature.

More concretely, let’s say Gmail receives an e-mail message from myserver.example.com. DKIM allows Gmail to verify that the e-mail was indeed sent by the designated e-mail server program on myserver.example.com, and not by, say, a virus running on myserver.example.com or a malicious user who happens to have access to myserver.example.com.

We will use OpenDKIM for this tutorial. To install OpenDKIM on Ubuntu:

$ sudo apt-get install opendkim opendkim-tools

/etc/opendkim.conf

Edit /etc/opendkim.conf to match the following:

# OpenDKIM config.

# Log to syslog
Syslog                  yes
SyslogSuccess           yes
LogWhy                  yes
# Required to use local socket with MTAs that access the socket as a non-
# privileged user (e.g. Postfix)
UMask                   002

Mode                    sv
PidFile                 /var/run/opendkim/opendkim.pid
UserID                  opendkim:opendkim
Socket                  inet:12301@localhost

Canonicalization        relaxed/simple
SignatureAlgorithm      rsa-sha256

# Sign for example.com with key in /etc/opendkim.d/mail.private using
# selector 'mail' (e.g. mail._domainkey.example.com)
Domain                  example.com
KeyFile                 /etc/opendkim.d/mail.private
Selector                mail

ExternalIgnoreList      refile:/etc/opendkim.d/TrustedHosts
InternalHosts           refile:/etc/opendkim.d/TrustedHosts

You can check out a detailed explanation for the meaning of each option with man opendkim.conf.

/etc/opendkim.d/TrustedHosts

Create the directory /etc/opendkim.d and put the following in /etc/opendkim.d/TrustedHosts. This instructs the OpenDKIM server to sign e-mails delivered by any server matching these expressions (such as myserver_2.example.com).

127.0.0.1
::1
localhost
192.168.0.1/24

*.example.com
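
To get a feel for what these entries cover, here is a small Python sketch that tests an IP address or host name against the same list. It is only an approximation of the idea, written for illustration; it is not OpenDKIM's actual matching code, and the function name is made up.

```python
import fnmatch
import ipaddress

TRUSTED_HOSTS = ["127.0.0.1", "::1", "localhost", "192.168.0.1/24", "*.example.com"]


def is_trusted(client: str, entries=TRUSTED_HOSTS) -> bool:
    """Return True if client (an IP address or host name) matches any entry."""
    try:
        client_ip = ipaddress.ip_address(client)
    except ValueError:
        client_ip = None  # not an IP address, so treat it as a host name
    for entry in entries:
        if client_ip is not None:
            try:
                # strict=False accepts entries with host bits set, e.g. 192.168.0.1/24.
                network = ipaddress.ip_network(entry, strict=False)
            except ValueError:
                continue  # entry is a host name pattern, not an address
            if client_ip in network:
                return True
        elif fnmatch.fnmatchcase(client.lower(), entry.lower()):
            return True
    return False


print(is_trusted("192.168.0.77"))            # True
print(is_trusted("myserver_2.example.com"))  # True
print(is_trusted("10.0.0.1"))                # False
```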

DKIM keys

Now, let’s generate our DKIM signing keys:

$ cd /etc/opendkim.d
$ sudo opendkim-genkey -s mail -d example.com

This will produce two files in /etc/opendkim.d: our private key, mail.private, and our public key, mail.txt.

We should make sure only OpenDKIM can read the private key, so that a malicious program or user on the same server won’t be able to forge our signature:

$ sudo chmod 600 mail.private
$ sudo chown opendkim:opendkim mail.private

DKIM public key DNS record

The next step is to publish our public key through DNS, so that any recipient e-mail program can verify our signature. If we look at our public key mail.txt generated in the previous step, it should look something like this:

mail._domainkey IN      TXT     ( "v=DKIM1; k=rsa; "
          "p=<alphabetical soup>" )  ; ----- DKIM key mail for example.com

Create the following TXT DNS record for example.com (how to update DNS records will depend on the DNS hosting provider):

  • Name: mail._domainkey.example.com

  • Value: v=DKIM1; k=rsa; p=<alphabetical soup>

where <alphabetical soup> is the public key found in mail.txt after p=. Note that DNS records will take a while (depending on our provider, up to a day) to propagate.
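
Copying the value out of mail.txt by hand is error-prone because it is split across several quoted strings. As a rough sketch, the following Python snippet joins those strings into the single value to paste into the DNS form. It assumes the comment after the closing parenthesis contains no double quotes, and the key below is a placeholder rather than real key material.

```python
import re


def dkim_txt_value(zone_fragment: str) -> str:
    """Join the quoted strings of a mail.txt-style record into one TXT value."""
    parts = re.findall(r'"([^"]*)"', zone_fragment)
    if not parts:
        raise ValueError("no quoted strings found")
    return "".join(parts)


# Same layout as the mail.txt excerpt above, with a placeholder key.
SAMPLE = '''mail._domainkey IN      TXT     ( "v=DKIM1; k=rsa; "
          "p=PLACEHOLDERKEY" )  ; ----- DKIM key mail for example.com'''

print(dkim_txt_value(SAMPLE))
# prints: v=DKIM1; k=rsa; p=PLACEHOLDERKEY
```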

OpenDKIM server

We can now start the OpenDKIM server with

$ sudo /etc/init.d/opendkim start

Postfix

The last step is to tell Postfix to use OpenDKIM to sign outgoing e-mail messages. Add the following to /etc/postfix/main.cf:

# Milter settings.
milter_protocol = 2
milter_default_action = accept
# OpenDKIM runs on port 12301.
smtpd_milters = inet:localhost:12301
non_smtpd_milters = inet:localhost:12301

If you already have other milters configured (such as SpamAssassin), simply add inet:localhost:12301 to your existing smtpd_milters and non_smtpd_milters lines, prefixed by a comma.

Let’s now restart Postfix with the new configuration:

$ sudo postfix reload

…and we’re done!

Step 2: SRS

SRS, short for Sender Rewriting Scheme, is a standard for including forwarding / relay information in a forwarded / relayed e-mail message.

For example, suppose alice@hotmail.com sends an e-mail to john@example.com, and our Postfix server on myserver.example.com forwards this e-mail to john123@gmail.com. SRS allows our Postfix server on myserver.example.com to attach a virtual sticky note on the e-mail message explaining this situation to Gmail. Otherwise, Gmail might become suspicious of why myserver.example.com is producing messages that purport to come from hotmail.com, which spammers and phishers are wont to do.
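
To make the "sticky note" a bit more concrete: SRS rewrites the envelope sender into an address on the forwarder's own domain, generally of the shape SRS0=hash=timestamp=original-domain=original-local-part@forwarder-domain. The Python sketch below is a simplified illustration of that idea. The hash and timestamp encoding here are my own simplifications and are not compatible with what PostSRSd actually generates; the secret and function names are made up.

```python
import hashlib
import hmac

BASE32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def _tag(secret: bytes, stamp: str, domain: str, local: str) -> str:
    """Short keyed hash so that only our server can mint valid addresses."""
    payload = "=".join([stamp, domain, local]).encode()
    return hmac.new(secret, payload, hashlib.sha1).hexdigest()[:4]


def srs_forward(sender: str, forwarder_domain: str, secret: bytes, day: int) -> str:
    """Rewrite sender into an SRS0-style address on forwarder_domain."""
    local, _, domain = sender.rpartition("@")
    # Two base32 characters encode the day number modulo 1024.
    stamp = BASE32[(day >> 5) & 31] + BASE32[day & 31]
    tag = _tag(secret, stamp, domain, local)
    return "SRS0=%s=%s=%s=%s@%s" % (tag, stamp, domain, local, forwarder_domain)


def srs_reverse(address: str, secret: bytes) -> str:
    """Recover the original sender, e.g. to route a bounce back to it."""
    local_part = address.rpartition("@")[0]
    prefix, tag, stamp, domain, local = local_part.split("=", 4)
    if prefix != "SRS0" or not hmac.compare_digest(tag, _tag(secret, stamp, domain, local)):
        raise ValueError("not a valid SRS0 address for this secret")
    return "%s@%s" % (local, domain)


rewritten = srs_forward("alice@hotmail.com", "example.com", b"hypothetical-secret", 100)
print(rewritten)
print(srs_reverse(rewritten, b"hypothetical-secret"))
```

The second function hints at why SRS needs a lookup in both directions: the rewritten address has to be mapped back to the original sender when a bounce comes in. This sketch does not check whether the timestamp has expired.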

We will use PostSRSd to implement SRS in our Postfix server. It works out of the box with Postfix and is a breeze to set up, but unfortunately is not included in the official Ubuntu / Debian package repositories.

PostSRSd

Let’s build and install PostSRSd from source:

# Dependencies.
$ sudo apt-get install unzip cmake

# Download and extract source code from GitHub.
$ cd /tmp
$ curl -L -o postsrsd.zip \
    https://github.com/roehling/postsrsd/archive/master.zip
$ unzip postsrsd.zip

# Build and install.
$ cd postsrsd-master
$ mkdir build
$ cd build
$ cmake -DCMAKE_INSTALL_PREFIX=/usr ../
$ make
$ sudo make install

The default config provided by PostSRSd (/etc/default/postsrsd) will pretty much work out of the box for our case.

The install script will also conveniently install an Upstart script for PostSRSd. Let’s start it now:

$ sudo service postsrsd start

Postfix

Finally, we configure Postfix to use PostSRSd. Add the following to /etc/postfix/main.cf:

# PostSRSd settings.
sender_canonical_maps = tcp:localhost:10001
sender_canonical_classes = envelope_sender
recipient_canonical_maps = tcp:localhost:10002
recipient_canonical_classes = envelope_recipient,header_recipient

And restart Postfix with the new configuration:

$ sudo postfix reload

That’s it!

]]>
<![CDATA[Custom Domain E-mails With Postfix and Gmail: The Missing Tutorial]]>

]]>
https://jichu4n.com/posts/custom-domain-e-mails-with-postfix-and-gmail-the-missing-tutorial/5bc6de6560307a000159bd6fThu, 06 Nov 2014 08:01:00 GMT

Custom domain e-mail addresses, like john@johnscompany.com, are cool and professional-looking. Would you want to e-mail potential clients as john123@gmail.com?

On the other hand, Gmail has great reliability, speed, spam filtration, and app support. Wouldn’t it be great if we could send and receive e-mail as john@johnscompany.com, but right from the comfort of our Gmail account?

Today, if we want to use Gmail to send and receive e-mails on a custom domain, we have a couple of options:

  • Google Apps for Work is Google’s own offering. It’s simple to set up and manage, but it’s pretty pricey at $5/mo per user, especially if we don’t need the bundled collaborative features like Google Docs and Calendar.

  • Third-party services could get the job done at a lower price point; for example, Pobox offers plans starting at $20/yr. The downside is that we will have to trust another organization with our e-mail.

  • Set up our own e-mail forwarding server. It’s fairly easy (if we follow this guide :) and if we already have shared / dedicated hosting or a VPS, this is basically free and gives us total control over our e-mail.

If you want to host custom domain e-mails in Gmail without paying for Google Apps or a third-party service, this tutorial is for you. It will show you how to quickly set up your own e-mail forwarding / relay server, and how to integrate it seamlessly with Gmail.

Prerequisites

For this tutorial, we’ll assume we have access to:

  • VPS or shared / dedicated hosting, with a dedicated IP address and capable of listening on ports 25 and 587. We will use an Ubuntu server for the examples.

  • DNS records for our domain. How DNS records are manipulated depends on the DNS hosting provider (or DNS server configuration).

We’ll assume

  • We have a domain example.com;

  • We have a server myserver.example.com;

  • We want to use the Gmail account john123@gmail.com to manage john@example.com.

Step 1: DNS Setup

The first step in setting up our e-mail forwarding server is to add MX, PTR and SPF DNS records for our server. This is really important as many e-mail providers, in order to prevent spam, will refuse to talk to mail servers without proper MX, PTR and SPF records set up. How to update DNS records will depend on the DNS hosting provider.

The following is what we need:

  • An MX record for example.com, pointing to myserver.example.com.

    This tells the world that e-mails to <whatever>@example.com should be delivered to the server myserver.example.com.

  • An A record for myserver.example.com, pointing to the IP address of our server.

  • A PTR (reverse DNS) record mapping the IP address of our server to myserver.example.com.

    This allows Gmail to verify the legitimacy of our server via its IP when Gmail receives a forwarded e-mail from it.

  • A TXT record, with key example.com and value v=spf1 mx ~all.

    This is an SPF record; it tells Gmail that the servers specified in the MX records of example.com, in this case only myserver.example.com, are allowed to send e-mails purporting to be from <whatever>@example.com. The trailing ~all marks all other servers attempting to do the same as a soft fail, so recipients will typically treat their messages as suspicious rather than reject them outright. This should be a sane default value, but feel free to customize it as you like.

DNS records will take a while (depending on our provider, up to a day) to propagate. Until they do, e-mails forwarded by our new e-mail server may get marked as spam or rejected outright.
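
To see how the SPF value above is put together, here is a small Python sketch that splits a record into its terms and names the result each one produces. It is a simplified reading aid with a made-up function name: it does not perform any DNS lookups, and it treats modifiers such as redirect= like ordinary mechanisms.

```python
# The four SPF qualifiers and the result each one yields on a match.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def parse_spf(record: str):
    """Split an SPF record into (result, mechanism) pairs, in evaluation order."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    parsed = []
    for term in terms[1:]:
        if term[0] in QUALIFIERS:
            parsed.append((QUALIFIERS[term[0]], term[1:]))
        else:
            # A mechanism without a qualifier defaults to "+" (pass).
            parsed.append(("pass", term))
    return parsed


print(parse_spf("v=spf1 mx ~all"))
# prints: [('pass', 'mx'), ('softfail', 'all')]
```

Read left to right: senders listed in the MX records pass, and everything else falls through to the softfail at the end.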

Step 2: Receiving E-mail

We’ll first set up an e-mail forwarding server which will let us receive e-mails sent to our domains.

We will use Postfix as the e-mail server. Let’s start by installing the necessary packages. We will assume an Ubuntu server in the examples below; if you’re using a different distribution, please consult your distribution’s documentation for the right commands.

$ sudo DEBIAN_FRONTEND=noninteractive apt-get install postfix

In the above command, we skip the debconf configuration UI with DEBIAN_FRONTEND=noninteractive as we will edit the configuration files directly.

/etc/postfix/main.cf

Open up /etc/postfix/main.cf in your favorite editor. This file will come pre-populated with a bunch of config options and comments. Replace the contents of the file with the following (you can comment out the original contents for reference):

# /etc/postfix/main.cf

# Host and site name.
myhostname = myserver.example.com
mydomain = example.com
myorigin = example.com

# Virtual aliases.
virtual_alias_domains = example.com
virtual_alias_maps = hash:/etc/postfix/virtual

The first few lines are pretty straightforward; they tell Postfix how to identify itself to the world. The last two lines tell Postfix to forward e-mails sent to <whatever>@example.com to another e-mail provider (Gmail), and that the forwarding is configured in the database file /etc/postfix/virtual.

/etc/postfix/virtual

Let’s now open up /etc/postfix/virtual and fill in our forwarding configuration:

# /etc/postfix/virtual

# Forwarding mapping, one from-to address pair per line. The format is:
#     <forward-from-addr> <whitespace> <forward-to-addr>
john@example.com        john123@gmail.com

We can add as many forwarding rules as we want, one on each line. We can use any number of tabs / spaces between the forward-from and forward-to addresses.
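
Once the file grows beyond a handful of rules, a quick sanity check can help catch lines with a missing target. The Python sketch below reads the simple one-rule-per-line format shown above; it is my own illustration rather than how Postfix parses the file, and it deliberately ignores features such as continuation lines.

```python
import re


def parse_virtual(text: str) -> dict:
    """Map each forward-from address to its list of forward-to addresses."""
    table = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) != 2:
            raise ValueError("line %d has no forward-to address: %r" % (number, raw))
        # Allow several targets separated by commas and / or whitespace.
        targets = [t for t in re.split(r"[,\s]+", parts[1]) if t]
        table[parts[0]] = targets
    return table


sample = """
# /etc/postfix/virtual
john@example.com        john123@gmail.com
"""
print(parse_virtual(sample))
# prints: {'john@example.com': ['john123@gmail.com']}
```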

Update lookup table

It turns out that Postfix doesn’t actually read /etc/postfix/virtual (surprise!); instead, what it reads is a lookup table generated from it. So, let’s generate the lookup table from our /etc/postfix/virtual:

$ sudo postmap /etc/postfix/virtual

Note: we must re-run this command every time we modify /etc/postfix/virtual!

(Re)start Postfix

It’s now time to (re)start Postfix with our new configuration:

$ sudo postfix start
$ sudo postfix reload

Testing

We should test our brand-new Postfix server by sending an e-mail to our forward-from address. You might want to check out these tutorials for testing directly with the telnet command.
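
If you prefer a script over telnet, a test message can also be handed to the server with Python's standard smtplib. The sketch below is one possible way to do it; the sender address is a placeholder, and the send function is only defined here, not called, so nothing is sent until you invoke it yourself from a machine that is allowed to reach port 25 on the server.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender: str, recipient: str) -> EmailMessage:
    """Create a small plain-text message for testing the forwarding rule."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Postfix forwarding test"
    msg.set_content("If this reaches Gmail, forwarding works.")
    return msg


def send_via_port_25(msg: EmailMessage, host: str) -> None:
    """Deliver msg directly to host on port 25, as another mail server would."""
    with smtplib.SMTP(host, 25, timeout=30) as smtp:
        smtp.send_message(msg)


# Example usage (sender address is a placeholder):
#   send_via_port_25(
#       build_test_message("tester@example.org", "john@example.com"),
#       "myserver.example.com")
```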

If you received the e-mail in your Gmail inbox, congratulations! Otherwise, check the Postfix logs at /var/log/mail.log (or with journalctl -u postfix if using Systemd) for errors. The most likely cause of issues is that DNS records have not propagated yet, in which case we’re likely to see a "rate limited" or "rejected for spam" type error message.

If we only want to receive but don’t care about sending e-mails as john@example.com, then this is all we need.

Otherwise, let’s move on to configuring Postfix to support sending e-mails from Gmail.

Step 3: Sending E-mail

Gmail requires a relay server (a server that will send e-mails to their destination on behalf of it) to speak TLS, which protects the communication between Gmail and the relay server.

We will use Cyrus SASL to authenticate Gmail to our server; TLS itself is set up further below. To install:

$ sudo apt-get install sasl2-bin libsasl2-modules

User name & password

Open relays are a terrible idea. We want to protect our e-mail server with a user name and password so that it will let Gmail send e-mails through it, but block spammers and other evil actors.

Cyrus SASL supports several backends for storing user names and passwords, including MySQL and PAM. However, we will go with the simplest backend — a plain database file.

Let’s create the user name / password database file in the default location /etc/sasldb2, with a single user named smtp (we can change the user name to anything we want):

$ sudo saslpasswd2 -c -u example.com smtp

Verify with:

$ sudo sasldblistusers2

Now, we make sure only Postfix can read this file:

$ sudo chmod 400 /etc/sasldb2
$ sudo chown postfix /etc/sasldb2

Lastly, we tell Cyrus SASL to use the file-based database to authenticate. Create the file /etc/postfix/sasl/smtpd.conf:

pwcheck_method: auxprop
auxprop_plugin: sasldb
mech_list: PLAIN LOGIN CRAM-MD5 DIGEST-MD5 NTLM
log_level: 7

SSL Certificate

We need an SSL certificate to enable TLS. While a proper SSL certificate signed by a Certificate Authority (CA) can be pricey, it turns out that a simple self-signed certificate suffices and works just fine.

So, let’s generate one.

  1. Generate an RSA private/public key pair. Note that you MUST supply a password to this command; the password will be removed in step 3.

    $ openssl genrsa -des3 -out example.key 2048
    
  2. Generate a Certificate Signing Request (CSR). Make sure to enter myserver.example.com when prompted for the "Common Name".

    $ openssl req -new -key example.key -out example.csr
    
  3. Remove RSA private/public key password.

    $ mv example.key example.key.orig
    $ openssl rsa -in example.key.orig -out example.key
    
  4. Generate a self-signed certificate. In the example, the generated certificate will be valid for 10 years.

    $ openssl x509 -req \
        -days 3650 -in example.csr -signkey example.key -out example.crt
    
  5. Create a PEM file.

    $ cat example.crt example.key > example.pem
    
  6. Move and protect the PEM file.

    $ sudo mv example.pem /etc/postfix/example.pem
    $ sudo chmod 400 /etc/postfix/example.pem
    $ sudo chown postfix /etc/postfix/example.pem
    

Note: Make sure to protect the generated private key and certificate with care!
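
Before pointing Postfix at the PEM file, it can be worth checking that it really contains both pieces. Here is a minimal Python sketch that looks only at the BEGIN / END labels in the file; it does not validate the certificate or the key, and the function names are made up for this example.

```python
import re

# Matches one PEM block and captures its label, e.g. "CERTIFICATE".
PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----.*?-----END \1-----", re.DOTALL)


def pem_block_types(pem_text: str) -> list:
    """Return the labels of the PEM blocks found, in file order."""
    return [match.group(1) for match in PEM_BLOCK.finditer(pem_text)]


def looks_like_cert_plus_key(pem_text: str) -> bool:
    """True if there is a certificate block and some kind of private key block."""
    labels = pem_block_types(pem_text)
    has_key = any(label.endswith("PRIVATE KEY") for label in labels)
    return "CERTIFICATE" in labels and has_key


# Example usage (reading the file requires sufficient permissions):
#   with open("/etc/postfix/example.pem") as f:
#       print(pem_block_types(f.read()))
```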

Relay server

The final step is to configure Postfix to enable relaying of e-mail on behalf of Gmail.

Let’s open up /etc/postfix/master.cf. This file should already contain a bunch of config options, some of which are commented out. Uncomment the lines starting with submission and edit them to match the following:

submission inet n       -       n       -       -       smtpd
  -o syslog_name=postfix/submission
  -o smtpd_tls_security_level=may
  -o smtpd_tls_cert_file=/etc/postfix/example.pem
  -o smtpd_sasl_auth_enable=yes
  -o smtpd_reject_unlisted_recipient=no
  -o smtpd_relay_restrictions=permit_sasl_authenticated,reject
  -o milter_macro_daemon_name=ORIGINATING

A Postfix restart is due after all these changes:

$ sudo postfix reload

If all went well, you should see Postfix serving a relay server, protected by our user name and password in /etc/sasldb2, on port 587.

Step 4: Configure Gmail

Finally, let’s log in to Gmail and tell it about our brand new server. (Note that the new Inbox UI does not yet support all the settings available in the legacy interface at the time of writing, so these instructions apply to the legacy interface.)

Let’s go to Settings > Accounts and Import and click Add another email address you own.

[Screenshot: gmail_1]

In the dialog that pops up, fill in our target e-mail address and click Next Step.

[Screenshot: gmail_2]

In the next dialog, enter the address of our e-mail server (myserver.example.com), and the user name and password we set up using saslpasswd2 above. Make sure the user name is suffixed with our domain name (so smtp@example.com rather than just smtp). Check that the correct port (587) and the correct security protocol (TLS) are selected, then click Add Account.

[Screenshot: gmail_3]

If all goes well, we should see a dialog like the following:

[Screenshot: gmail_4]

And we should receive an e-mail, forwarded by our mail server set up in the first part, to our john123@gmail.com address. Click on the link inside the e-mail and we’re done!

Now, in any e-mail compose / reply window, you should be able to select john@example.com in the drop-down list next to "From". Congrats!

If, on the other hand, you see an error like the following:

[Screenshot: gmail_5]

You should check the Postfix logs at /var/log/mail.log (or with journalctl -u postfix if using Systemd) for any errors.

[Optional] Step 5: Set Up DKIM and SRS

DKIM, short for DomainKeys Identified Mail, is an e-mail validation system for preventing e-mail spoofing. Google recommends setting up DKIM; if we don’t, the e-mail messages that we forward to Gmail have a higher probability of getting marked as spam.

SRS, short for Sender Rewriting Scheme, is a relatively new standard that e-mail forwarding servers are recommended to adopt, and it also helps reduce the likelihood of our forwarded e-mail messages getting marked as spam.

Thus, while technically optional, it’s a good idea to configure our Postfix server to support both these standards. See my follow-up tutorial on Setting Up DKIM And SRS In Postfix for a detailed step-by-step guide.

Conclusion

In this post, we have set up a Postfix e-mail server that accomplishes the following:

  • E-mails sent to john@example.com are received by Postfix on myserver.example.com port 25, and forwarded to john123@gmail.com;

  • Gmail will let you select john@example.com as a "from" address when composing / replying to an e-mail, and will relay such e-mails through Postfix on myserver.example.com port 587 using a configured user name / password combination.

I hope you found this guide useful, and if you have any thoughts / questions, you’re more than welcome to leave a comment below. Cheers!

]]>
<![CDATA[How To Add Custom Build Steps and Commands To setup.py]]>

]]>
https://jichu4n.com/posts/how-to-add-custom-build-steps-and-commands-to-setuppy/5bc6d8e460307a000159bd41Wed, 08 Oct 2014 06:38:00 GMT

A setup.py script using distutils / setuptools is the standard way to package Python code. Often, however, we need to perform custom actions for code generation, running tests, profiling, or building documentation, etc., and we’d like to integrate these actions into setup.py. In other words, we’d like to add custom steps to setup.py build or setup.py install, or add a new command altogether to setup.py.

Let’s see how this is done.

Adding Custom setup.py Commands and Options

Let’s implement a custom command that runs Pylint on all Python files in our project. The high level idea is:

  1. Implement the command as a subclass of distutils.cmd.Command;

  2. Add the newly defined command class to the cmdclass argument to setup().

To see this in action, let’s add the following to our setup.py:

import distutils.cmd
import distutils.log
import os
import setuptools
import subprocess


class PylintCommand(distutils.cmd.Command):
  """A custom command to run Pylint on all Python source files."""

  description = 'run Pylint on Python source files'
  user_options = [
      # The format is (long option, short option, description).
      ('pylint-rcfile=', None, 'path to Pylint config file'),
  ]

  def initialize_options(self):
    """Set default values for options."""
    # Each user option must be listed here with its default value.
    self.pylint_rcfile = ''

  def finalize_options(self):
    """Post-process options."""
    if self.pylint_rcfile:
      assert os.path.exists(self.pylint_rcfile), (
          'Pylint config file %s does not exist.' % self.pylint_rcfile)

  def run(self):
    """Run command."""
    command = ['/usr/bin/pylint']
    if self.pylint_rcfile:
      command.append('--rcfile=%s' % self.pylint_rcfile)
    command.append(os.getcwd())
    self.announce(
        'Running command: %s' % str(command),
        level=distutils.log.INFO)
    subprocess.check_call(command)


setuptools.setup(
    cmdclass={
        'pylint': PylintCommand,
    },
    # Usual setup() args.
    # ...
)

Now, running python setup.py --help-commands will show:

Standard commands:
  ...
Extra commands:
  pylint: run Pylint on Python source files
  ...

We can now run the command we just defined with:

$ python setup.py pylint

…or with a custom option:

$ python setup.py pylint --pylint-rcfile=.pylintrc
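What ties the --pylint-rcfile option to the self.pylint_rcfile attribute is a naming rule: the trailing = in user_options only marks the option as taking a value, and hyphens become underscores. The helper below is a hypothetical stand-alone restatement of that rule for illustration; it is not part of distutils or setuptools:

```python
def option_to_attribute(long_option):
    """Return the attribute name a command uses for a long option.

    Hypothetical helper; it only mirrors the naming rule described above.
    """
    # A trailing '=' marks an option that takes a value; it is not part of the name.
    name = long_option.rstrip('=')
    # Hyphens are not valid in Python identifiers, so they become underscores.
    return name.replace('-', '_')


print(option_to_attribute('pylint-rcfile='))  # prints pylint_rcfile
```

This is why every entry in user_options needs a matching attribute assignment in initialize_options().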

To learn more, you can check out documentation on inheriting from distutils.cmd.Command as well as the source code of some built-in commands, such as build_py.

Adding Custom Steps to setup.py build

Let’s say we are really paranoid about code style and we’d like to run Pylint as part of setup.py build. We can do this in the following manner:

  1. Create a subclass of setuptools.command.build_py.build_py (or distutils.command.build_py.build_py if using distutils) that invokes our new Pylint command;

  2. Add the newly defined command class to the cmdclass argument to setup().

For example, we can implement the following in our setup.py:

import setuptools.command.build_py


class BuildPyCommand(setuptools.command.build_py.build_py):
  """Custom build command."""

  def run(self):
    self.run_command('pylint')
    setuptools.command.build_py.build_py.run(self)


setuptools.setup(
    cmdclass={
        'pylint': PylintCommand,
        'build_py': BuildPyCommand,
    },
    # Usual setup() args.
    # ...
)

For more examples, I encourage you to check out the setuptools source code.
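To make the call order concrete, here is a toy model of the dispatch involved. ToyDistribution and the command classes are made-up stand-ins, not the real distutils classes; the one behavior the sketch assumes is that run_command() skips a command that has already run during the current invocation:

```python
class ToyDistribution:
    """Toy stand-in for the object that owns and runs commands."""

    def __init__(self, cmdclass):
        self.cmdclass = cmdclass
        self.have_run = set()
        self.log = []

    def run_command(self, name):
        # Each command runs at most once per invocation.
        if name in self.have_run:
            return
        self.cmdclass[name](self).run()
        self.have_run.add(name)


class ToyCommand:
    def __init__(self, dist):
        self.dist = dist


class Pylint(ToyCommand):
    def run(self):
        self.dist.log.append('pylint')


class StockBuildPy(ToyCommand):
    def run(self):
        self.dist.log.append('build_py')


class BuildPy(StockBuildPy):
    def run(self):
        self.dist.run_command('pylint')  # custom step first
        StockBuildPy.run(self)           # then the stock behavior


dist = ToyDistribution({'pylint': Pylint, 'build_py': BuildPy})
dist.run_command('build_py')
dist.run_command('pylint')  # already ran as part of build_py, so it is skipped
print(dist.log)  # prints ['pylint', 'build_py']
```

Registering the subclass under the 'build_py' key is what makes the stock build pick up the extra step.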

]]>
<![CDATA[How X Window Managers Work, And How To Write One (Part II)]]>

In Part I of this series, we examined the role of X window managers in a modern Linux/BSD desktop environment, and how they interact with the X server and applications. In Part II, we will dig into the dirty details and walk through the code of an example reparenting

]]>
https://jichu4n.com/posts/how-x-window-managers-work-and-how-to-write-one-part-ii/5bc6d9a160307a000159bd47Mon, 09 Jun 2014 06:41:00 GMT

In Part I of this series, we examined the role of X window managers in a modern Linux/BSD desktop environment, and how they interact with the X server and applications. In Part II, we will dig into the dirty details and walk through the code of an example reparenting non-compositing window manager, basic_wm.

Introduction

Before we start with the code, let’s go over a couple of basic implementation choices such as language and API.

Language

You can write a window manager in Haskell, Python, Lisp, Go, Java, or any other language that has X bindings, i.e. a library for communicating with X servers.

I chose C++ for basic_wm, our example window manager, mainly because the C libraries for X11 are the best documented. In addition to books such as the Xlib Programming Manual, documentation can be found in the form of widely available man pages (e.g., try man XOpenDisplay at a terminal). Example usage and common patterns abound in the source code of many great window managers written in the past three decades.

We will use C++11 and C++14 features where convenient, so you will need a compatible compiler (GCC 4.9 or higher, or Clang 3.4 or higher) if you want to play with the example source code.

A Tale of Two X Libraries

There are two official C libraries for X: Xlib and XCB. Xlib, hailing from 1985, was the original X client library, and was the only official X client library until the introduction of XCB in 2001. The two libraries have very different philosophies: whereas Xlib tries to hide the X protocol behind a friendly C API with lots of bells and whistles, XCB directly exposes the plumbing beneath.

In practice, this difference manifests itself most prominently in how the two libraries handle the fundamentally asynchronous nature of X’s client-server architecture. Xlib attempts to hide the asynchronous X protocol behind a mixed synchronous and asynchronous API, whereas XCB exposes a fully asynchronous API.

For example, to look up the attributes (e.g., size and position) of a window, you would write the following code using Xlib:

XWindowAttributes attrs;
XGetWindowAttributes(display, window, &attrs);
// Do stuff.

Under the hood, XGetWindowAttributes() sends a request to the X server and blocks until it receives a response; in other words, it is synchronous. On the other hand, using XCB, you would write this instead:

xcb_get_window_attributes_cookie_t cookie =
    xcb_get_window_attributes(
        connection, window);
// Do other stuff while waiting for reply.
xcb_get_window_attributes_reply_t* reply =
    xcb_get_window_attributes_reply(
        connection, cookie, nullptr);
// Do stuff.
free(reply);

The function xcb_get_window_attributes merely sends the request to the X server, and returns immediately without waiting for the reply; in other words, it is asynchronous. The client program must subsequently call xcb_get_window_attributes_reply to block on the response.

The advantage of the asynchronous approach is obvious if we consider an example where we need to retrieve the attributes of, say, 5 windows at once. Using XCB, we can immediately fire off all 5 requests to the X server, and then wait for all of them to return. With Xlib, we have to send one request at a time and wait for its response to come back before we can send the next request. Therefore, we’d expect to only block for the duration of one round-trip to the X server using XCB, compared to 5 with Xlib.
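The arithmetic can be spelled out with an idealized model, written in Python for brevity. It assumes every request costs exactly one round trip and ignores server processing time and the cost of sending; the 2 ms round trip is an arbitrary value chosen for illustration:

```python
def blocking_time_sequential(num_requests, round_trip_ms):
    """Each request waits for its reply before the next one is sent."""
    return num_requests * round_trip_ms


def blocking_time_pipelined(num_requests, round_trip_ms):
    """All requests are sent up front, so the replies overlap in flight."""
    return round_trip_ms if num_requests > 0 else 0


ROUND_TRIP_MS = 2  # assumed value, for illustration only
print(blocking_time_sequential(5, ROUND_TRIP_MS))  # prints 10
print(blocking_time_pipelined(5, ROUND_TRIP_MS))   # prints 2
```

The gap grows linearly with the number of requests, which is why it matters most over slow or remote connections.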

The downside of XCB’s fully asynchronous approach is verbosity and a less programmer-friendly interface. The Xlib code above looks like your average C library call; the XCB code above is significantly more involved.

However, it is important to note that Xlib isn’t fully synchronous. Rather, Xlib has a mixture of synchronous and asynchronous APIs. In general, functions that do not return values (e.g., XResizeWindow, which changes the size of a window) are asynchronous, while functions that return values (e.g., XGetGeometry, which returns the size and position of a window) are synchronous:

Xlib saves up requests instead of sending them to the server immediately, so that the client program can continue running instead of waiting to gain access to the network after every Xlib call. This is possible because most Xlib calls do not require immediate action by the server. This grouping of requests by the client before sending them over the network also increases the performance of most networks, because it makes the network transactions longer and less numerous, reducing the total overhead involved.

Xlib sends the buffer full of requests to the server under three conditions.

The most common is when an application calls an Xlib routine to wait for an event but no matching event is currently available on Xlib’s queue. Since, in this case, the application must wait for an appropriate event anyway, it makes sense to flush the request buffer.

Second, Xlib calls that get information from the server require a reply before the program can continue, and therefore, the request buffer is sent and all the requests acted on before the information is returned.

Third, the client would like to be able to flush the request buffer manually in situations where no user events and no calls to query the server are expected. One good example of this third case is an animated game, where the display changes even when there is no user input.

— Xlib Programming Manual §2.1.2

This is the most confusing aspect of Xlib, and a source of endless frustration for those new to X programming. One of the major motivations for the creation of XCB was to eliminate this complexity.
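The three flush conditions quoted above can be captured in a small toy model. ToyXlibConnection is a made-up class for illustration and does not talk to an X server; each method only records when queued requests would leave the client:

```python
class ToyXlibConnection:
    """Toy model of Xlib's output buffer; not a real X binding."""

    def __init__(self):
        self.buffer = []       # requests queued on the client side
        self.sent = []         # requests that have reached the "server"
        self.event_queue = []  # events already received by the client

    def flush(self):
        # Condition 3: the client flushes explicitly.
        self.sent.extend(self.buffer)
        self.buffer.clear()

    def request_without_reply(self, name):
        # E.g. a resize: there is nothing to wait for, so just queue it.
        self.buffer.append(name)

    def request_with_reply(self, name):
        # Condition 2: a reply is needed, so everything queued goes out first.
        self.buffer.append(name)
        self.flush()
        return 'reply to ' + name

    def next_event(self):
        # Condition 1: no event is available, so flush before waiting.
        if not self.event_queue:
            self.flush()
            return None  # a real client would block here
        return self.event_queue.pop(0)


conn = ToyXlibConnection()
conn.request_without_reply('resize')
print(conn.sent)  # prints []
conn.request_with_reply('get geometry')
print(conn.sent)  # prints ['resize', 'get geometry']
```

The resize only reaches the server once a later call forces the buffer out, which is exactly the behavior that surprises newcomers.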

Many popular window managers have already been ported to XCB from Xlib for the performance benefits. If you are interested, you can read up on how the Awesome and KWin window managers were ported to XCB.

I chose to use Xlib for basic_wm, however, because in a pedagogical example, readability and simplicity are much more important than performance. In fact, I would recommend starting with Xlib for any project and worrying about porting to XCB later, as Xlib is much easier to learn and prototype with.

While an in-depth discussion of the merits of Xlib and XCB is beyond the scope of this discussion, I do recommend you check out the official article on Xlib vs. XCB as it presents a fascinating case study of API design.

Dependencies and Building

Firstly, you will need Xlib development headers in order to compile against Xlib. They are available on Debian/Ubuntu as libx11-dev, on Fedora as libX11-devel, and on Arch Linux as part of libx11.

The only additional library used by the example basic_wm code is google-glog, Google’s open source C++ logging library. It is available on Debian/Ubuntu as libgoogle-glog-dev, on Fedora as glog-devel, and on Arch Linux as google-glog.

The recommended way to build the source code is with GNU Make: just run make in the source directory. Alternatively, g++ *.cpp will also do the trick if you supply all the libraries correctly.

To test the window manager, you will likely need Xephyr along with a couple of simple X programs such as xeyes or xterm.

Step 1: Setup and Teardown

Let’s start off with a skeleton implementation of the WindowManager class, which will encapsulate all the window management logic in our example. All it will do for now is set up a connection to the X server on construction, and close that connection on destruction.

In window_manager.hpp:

extern "C" {
#include <X11/Xlib.h>
}
#include <memory>

class WindowManager {
 public:
  // Factory method for establishing a connection to an X server and creating a
  // WindowManager instance.
  static ::std::unique_ptr<WindowManager> Create();
  // Disconnects from the X server.
  ~WindowManager();
  // The entry point to this class. Enters the main event loop.
  void Run();

 private:
  // Invoked internally by Create().
  WindowManager(Display* display);

  // Handle to the underlying Xlib Display struct.
  Display* display_;
  // Handle to root window.
  const Window root_;
};

In window_manager.cpp:

#include "window_manager.hpp"
#include <glog/logging.h>

using ::std::unique_ptr;

unique_ptr<WindowManager> WindowManager::Create() {
  // 1. Open X display.
  Display* display = XOpenDisplay(nullptr);
  if (display == nullptr) {
    LOG(ERROR) << "Failed to open X display " << XDisplayName(nullptr);
    return nullptr;
  }
  // 2. Construct WindowManager instance.
  return unique_ptr<WindowManager>(new WindowManager(display));
}

WindowManager::WindowManager(Display* display)
    : display_(CHECK_NOTNULL(display)),
      root_(DefaultRootWindow(display_)) {
}

WindowManager::~WindowManager() {
  XCloseDisplay(display_);
}

void WindowManager::Run() { /* TODO */ }

The main function in main.cpp:

#include <cstdlib>
#include <glog/logging.h>
#include "window_manager.hpp"

using ::std::unique_ptr;

int main(int argc, char** argv) {
  ::google::InitGoogleLogging(argv[0]);

  unique_ptr<WindowManager> window_manager(WindowManager::Create());
  if (!window_manager) {
    LOG(ERROR) << "Failed to initialize window manager.";
    return EXIT_FAILURE;
  }

  window_manager->Run();

  return EXIT_SUCCESS;
}

Even if you have never programmed Xlib before, this should not be hard to understand. WindowManager::Create() is a static factory method that sets up a connection to an X server via XOpenDisplay(); we will let XOpenDisplay() figure out which X server to connect to from the DISPLAY environment variable. The connection is represented by the opaque Display structure. We call XCloseDisplay() on the saved Display* in the destructor to close the connection.

The other function of note is DefaultRootWindow(), which returns the default root window for a given X server. Technically, an X server may have several root windows in some rare multihead setups, but let’s not worry about that here.

If you run this program now, it should connect to the X server, close the connection, and exit. Hooray!

Step 2: Initialization

Now, let’s dig into the mysterious Run() function above. We’ll start with the initialization steps required after opening an X server connection. In window_manager.hpp:

class WindowManager {
  ...
  // Xlib error handler. It must be static as its address is passed to Xlib.
  static int OnXError(Display* display, XErrorEvent* e);
  // Xlib error handler used to determine whether another window manager is
  // running. It is set as the error handler right before selecting substructure
  // redirection mask on the root window, so it is invoked if and only if
  // another window manager is running. It must be static as its address is
  // passed to Xlib.
  static int OnWMDetected(Display* display, XErrorEvent* e);
  // Whether an existing window manager has been detected. Set by OnWMDetected,
  // and hence must be static.
  static bool wm_detected_;
};

In window_manager.cpp:

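// Definition of the static member declared in window_manager.hpp.
bool WindowManager::wm_detected_;

void WindowManager::Run() {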
  // 1. Initialization.
  //   a. Select events on root window. Use a special error handler so we can
  //   exit gracefully if another window manager is already running.
  wm_detected_ = false;
  XSetErrorHandler(&WindowManager::OnWMDetected);
  XSelectInput(
      display_,
      root_,
      SubstructureRedirectMask | SubstructureNotifyMask);
  XSync(display_, false);
  if (wm_detected_) {
    LOG(ERROR) << "Detected another window manager on display "
               << XDisplayString(display_);
    return;
  }
  //   b. Set error handler.
  XSetErrorHandler(&WindowManager::OnXError);

  // 2. Main event loop.
  ...
}

int WindowManager::OnWMDetected(Display* display, XErrorEvent* e) {
  // In the case of an already running window manager, the error code from
  // XSelectInput is BadAccess. We don't expect this handler to receive any
  // other errors.
  CHECK_EQ(static_cast<int>(e->error_code), BadAccess);
  // Set flag.
  wm_detected_ = true;
  // The return value is ignored.
  return 0;
}

int WindowManager::OnXError(Display* display, XErrorEvent* e) {
  /* Print e */
  // The return value is ignored.
  return 0;
}

We first select substructure redirection and substructure notify events on the root window. This is discussed in more detail in the Substructure Redirection section in Part I; to recap, this allows the window manager to intercept requests from top level windows, and subscribe to events concerning the same. Only one X client can select substructure redirection on the root window at any given time; the second client to attempt to do so will get a BadAccess error.

Catching this error is somewhat tricky, however. XSelectInput, like all asynchronous Xlib functions, does not actually send a request to the X server, but instead only queues the request and returns. Hence, we have to explicitly flush the request queue with XSync (see our discussion above in A Tale of Two X Libraries). We set up a temporary error handler, OnWMDetected, to catch errors during this XSync invocation.
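The pattern of swapping in a temporary error handler around a forced sync can be modeled without X at all. The sketch below is a toy in Python: ToyServer, ToyClient and try_become_window_manager are hypothetical names, and only the numeric value of BadAccess is taken from X11/X.h:

```python
BAD_ACCESS = 10  # numeric value of BadAccess in X11/X.h


class ToyServer:
    def __init__(self):
        self.redirect_owner = None

    def select_redirect(self, client):
        # Only one client may hold substructure redirection at a time.
        if self.redirect_owner not in (None, client):
            return BAD_ACCESS
        self.redirect_owner = client
        return None


class ToyClient:
    def __init__(self, server):
        self.server = server
        self.pending = []
        self.error_handler = None

    def select_input(self):
        # Only queues the request; no error can be reported yet.
        self.pending.append(self.server.select_redirect)

    def sync(self):
        # Send everything and deliver any resulting errors to the handler.
        for request in self.pending:
            error = request(self)
            if error is not None and self.error_handler:
                self.error_handler(error)
        self.pending.clear()


def try_become_window_manager(client):
    detected = []
    client.error_handler = detected.append  # temporary handler
    client.select_input()
    client.sync()                           # errors surface here, not earlier
    client.error_handler = None             # a real WM installs its regular handler
    return not detected


server = ToyServer()
first, second = ToyClient(server), ToyClient(server)
print(try_become_window_manager(first))   # prints True
print(try_become_window_manager(second))  # prints False
```

Without the sync() call the second client would return from select_input() believing it had succeeded.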

Next, we set up our regular error handler which will be invoked for any future errors. Our implementation, which logs the error and continues, will be an important debugging aid as we implement and test our window manager. I will not show it here for the sake of brevity; for reference, check it out in window_manager.cpp.

Step 3: The Event Loop

Now let’s add the signature construct of every modern GUI program, the event loop, to the Run() method above. In window_manager.cpp:

void WindowManager::Run() {
  // 1. Initialization.
  ...

  // 2. Main event loop.
  for (;;) {
    // 1. Get next event.
    XEvent e;
    XNextEvent(display_, &e);
    LOG(INFO) << "Received event: " << ToString(e);

    // 2. Dispatch event.
    switch (e.type) {
      case CreateNotify:
        OnCreateNotify(e.xcreatewindow);
        break;
      case DestroyNotify:
        OnDestroyNotify(e.xdestroywindow);
        break;
      case ReparentNotify:
        OnReparentNotify(e.xreparent);
        break;
      ...
      // etc. etc.
      ...
      default:
        LOG(WARNING) << "Ignored event";
    }
  }
}

If you have done low-level GUI programming before, this should look very familiar. We sit in an event loop and repeatedly fetch the next event with XNextEvent() and dispatch it to the appropriate handlers.

The structure of the XEvent type is typical of a polymorphic C structure. Each type of event carries different attributes and corresponds to an event struct, such as XKeyEvent, XButtonEvent, and XConfigureEvent. The first field of each struct is always int type. The XEvent type is a C union of all the event structs plus int type:

typedef struct _XKeyEvent {
  int type;
  // Fields specific to XKeyEvent.
  ...
} XKeyEvent;

typedef struct _XButtonEvent {
  int type;
  // Fields specific to XButtonEvent.
  ...
} XButtonEvent;

// etc.
...

typedef union _XEvent {
  int type;
  XKeyEvent xkey;
  XButtonEvent xbutton;
  // etc.
  ...
} XEvent;

This way, the type is always available regardless of the type of event and requires no additional storage. The same pattern can be observed in GTK+/GLib, Python’s C API, and many other object-oriented C APIs.
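The overlapping layout is easy to check with Python's ctypes module, which can declare C-compatible structs and unions. The structs below are heavily simplified stand-ins with made-up field lists (the real Xlib event structs carry many more fields); only the shared leading int type mirrors the Xlib layout:

```python
import ctypes


class KeyEvent(ctypes.Structure):
    _fields_ = [('type', ctypes.c_int), ('keycode', ctypes.c_uint)]


class ButtonEvent(ctypes.Structure):
    _fields_ = [('type', ctypes.c_int), ('button', ctypes.c_uint)]


class Event(ctypes.Union):
    _fields_ = [
        ('type', ctypes.c_int),
        ('xkey', KeyEvent),
        ('xbutton', ButtonEvent),
    ]


KEY_PRESS = 2  # value of KeyPress in X11/X.h

event = Event()
event.xkey.type = KEY_PRESS
event.xkey.keycode = 38

# The union's own 'type' member aliases the first field of every variant.
print(event.type == KEY_PRESS)                          # prints True
print(event.xbutton.type == KEY_PRESS)                  # prints True
print(ctypes.sizeof(Event) == ctypes.sizeof(KeyEvent))  # prints True
```

Because every variant starts with the same field, code can read the type first and only then decide which member of the union is meaningful.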

In basic_wm, the event handlers follow the naming convention of OnFoo(), where Foo is the type of the event, so it should be straightforward to figure out who does what.

What’s Next

We now have a basic skeleton for our window manager, and we can start filling in the meat - the event handlers. The million-dollar question is, what events does a window manager handle, and what should it do with them?

In the next installment in this series, we’ll answer that question by diving into the complex ways window managers, clients and the user interact with each other via X events. In the meantime, you’re more than welcome to check out the code for basic_wm on GitHub.

Next chapter: How X Window Managers Work, And How To Write One (Part III)

]]>
<![CDATA[DEBUG trap and PROMPT_COMMAND in Bash]]>


]]>
https://jichu4n.com/posts/debug-trap-and-prompt_command-in-bash/5bc6e11f60307a000159bd91Sun, 08 Jun 2014 07:13:00 GMT

Update 03/08/2016: A patch by Dan Stromberg adds a PS0 variable to Bash that greatly simplifies what’s described in this article. This patch will likely be merged into Bash 4.4. Please refer to his post for details.

The DEBUG trap

The DEBUG trap is an extremely handy feature of Bash. The idea is pretty straightforward: if you run

trap "echo Hello" DEBUG

then Bash will run echo Hello before it executes each subsequent command. For example:

~/Scratch $ ls
Hello
file1 file2
~/Scratch $ echo Bye
Hello
Bye

A caveat, however, is that the DEBUG trap is triggered once per simple command; if you have command lists or control structures, the trap will be triggered multiple times. For example, using the setup above:

~/Scratch $ echo 1 && echo 2; echo 3
Hello
1
Hello
2
Hello
3
~/Scratch $ if [ -e /etc/passwd ]; then echo "/etc/passwd exists"; fi
Hello
Hello
/etc/passwd exists

What if we only want to run a command once per composite command, like the preexec hook in zsh?

Enter PROMPT_COMMAND.

PROMPT_COMMAND

The idea behind PROMPT_COMMAND is also very simple: if you run

PROMPT_COMMAND="echo Bye"

then Bash will execute echo Bye before it prints each subsequent prompt (i.e., after it has finished executing the previous command line). For example, using the setup above:

~/Scratch $ echo 1; echo 2
Hello
1
Hello
2
Hello
Bye

Note that the DEBUG trap is triggered again for PROMPT_COMMAND, in addition to the user-supplied commands.

Combining the DEBUG trap and PROMPT_COMMAND

By combining the DEBUG trap and PROMPT_COMMAND, we can now hack Bash to run some code right before and right after executing a full command. For example, try adding this to your ~/.bashrc:

# This will run before any command is executed.
function PreCommand() {
  if [ -z "$AT_PROMPT" ]; then
    return
  fi
  unset AT_PROMPT

  # Do stuff.
  echo "Running PreCommand"
}
trap "PreCommand" DEBUG

# This will run after the execution of the previous full command line.  We don't
# want PostCommand to execute when first starting a bash session (i.e., at
# the first prompt).
FIRST_PROMPT=1
function PostCommand() {
  AT_PROMPT=1

  if [ -n "$FIRST_PROMPT" ]; then
    unset FIRST_PROMPT
    return
  fi

  # Do stuff.
  echo "Running PostCommand"
}
PROMPT_COMMAND="PostCommand"

The result:

~/Scratch $ echo 1; echo 2 && echo 3
Running PreCommand
1
2
3
Running PostCommand
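The interplay of the two flags is a small state machine, and it can help to trace it outside the shell. The following is a toy model in Python, not a Bash replacement: ToyShell is a made-up class, and it assumes the behavior described above, namely that the DEBUG trap fires before every simple command and once more before PROMPT_COMMAND itself:

```python
class ToyShell:
    """Toy model of the two hooks; it does not run Bash."""

    def __init__(self):
        self.at_prompt = False
        self.first_prompt = True
        self.output = []

    def pre_command(self):
        # DEBUG trap body: act only on the first simple command after a prompt.
        if not self.at_prompt:
            return
        self.at_prompt = False
        self.output.append('Running PreCommand')

    def post_command(self):
        # PROMPT_COMMAND body: record that a prompt is about to be shown.
        self.at_prompt = True
        if self.first_prompt:
            self.first_prompt = False
            return
        self.output.append('Running PostCommand')

    def show_prompt(self):
        self.pre_command()   # the DEBUG trap also fires for PROMPT_COMMAND
        self.post_command()

    def run_line(self, simple_commands):
        for command in simple_commands:
            self.pre_command()           # DEBUG trap before each simple command
            self.output.append(command)  # stand-in for the command's own output
        self.show_prompt()


shell = ToyShell()
shell.show_prompt()              # first prompt of the session
shell.run_line(['1', '2', '3'])  # echo 1; echo 2 && echo 3
print(shell.output)
# prints ['Running PreCommand', '1', '2', '3', 'Running PostCommand']
```

The trace shows why AT_PROMPT must be set inside PostCommand rather than before it: the trap that fires for PROMPT_COMMAND itself still sees the flag unset and returns early.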

This gives rise to some neat applications, such as a command timer script I wrote that prints out the execution time of each command:

[Figure: screenshot of the Bash command timer]

Please feel free to check it out on GitHub :)

Happy Bash hacking!

]]>
<![CDATA[How X Window Managers Work, And How To Write One (Part I)]]>


]]>
https://jichu4n.com/posts/how-x-window-managers-work-and-how-to-write-one-part-i/5bc6cde4f21e630001a84ab4Fri, 11 Apr 2014 05:54:00 GMT

Window managers are one of the core components of the modern Linux/BSD desktop. It is not an exaggeration to say that they define to a large degree our day-to-day user experience, as they are responsible for deciding how individual windows look, move around, react to input, and organize themselves. Hence, almost 30 years since the first X window manager, we still argue over the merits of different window managers, and new window managers continue to reinvent how we interact with our digital world.

In this series of posts, I hope to demystify how window managers work, and how you might go about writing one yourself.

I will be quoting quite heavily from the seminal Xlib Programming Manual (3rd Ed, 1994) by Adrian Nye and published by O’Reilly. Despite its age, it remains amazingly relevant and is the best available introductory text to the internals of X, which has not changed over the past two decades as much as you’d think. Since you could buy the book plus shipping for less than the price of a cup of coffee, I strongly recommend it to anyone interested in learning more about X. In addition, its chapter 16 also covers the basics of window management.

The Role of an X Window Manager

Let’s start with an examination of the role of the window manager in a modern Linux/BSD desktop environment.

The Rights of X Window Managers

Unlike other windowing systems such as Microsoft Windows or Mac OS X, X does not dictate a window manager or how a window manager should behave. This decision is to thank for the wild diversity of X window managers we see today.

X is somewhat unusual in that it does not mandate a particular type of window manager. Its developers have tried to make X itself as free of window management or user interface policy as possible.

— Xlib Programming Manual §1.2.3

In fact, it does not even require a window manager to be present at all:

Unlike citizens, the window manager has rights but not responsibilities. Programs must be prepared to cooperate with any type of window manager or with none at all […].

— Xlib Programming Manual §1.2.3

This is in stark contrast to the integrative approach of other GUI systems. On Mac OS X and Unity, for example, an application could not possibly function without the window manager, as the latter is responsible for rendering a part of the application’s interface (e.g., menus).

The Responsibilities of X Window Managers

As you probably already know, X operates in a server-client model. An X server controls one or more physical display devices as well as input devices (mouse, keyboard, etc.). An application that wants to interact with these devices assumes the role of an X client. An X server and its clients may run on the same computer, in which case they communicate via Unix domain sockets, or on different computers, in which case they communicate through TCP/IP.

A window manager is a regular X client. It doesn’t have any superuser privileges or keys to kernel backdoors; it is a normal user process that is allowed by the X server to call a set of special APIs. X ensures that no more than one window manager is running at any given point by denying a client access to these APIs if another client currently has access. The first client to attempt to access these APIs always succeeds.

A window manager communicates with the windows it manages through two X mechanisms: properties and events. We will discuss these in detail in later sections, but the takeaway is that the communication happens through the X server, not directly between the window manager and other applications.

This is illustrated by the following diagram:

[Figure: diagram of the window manager and other X clients communicating through the X server]

How an X Window Manager Manages Windows

Let’s now dive into the details of how a window manager does its job.

The Window Hierarchy

When we think about modern GUIs, we usually use the term widgets or controls to refer to UI elements such as buttons, scrollbars, or text boxes, and the term windows to refer to a container for such widgets that has its own name and can be independently moved around, closed, resized, etc.

X, however, was designed to be as low-level as possible. The fundamental UI model that X provides, upon which UI frameworks such as GTK+ and Qt are built, is that of a hierarchy of rectangles. In X terminology, all top level windows and all UI elements within them are windows. In other words, a window is any rectangular area that is a unit of user interaction and/or graphic display.

Windows are organized into a tree hierarchy. At the root of the hierarchy is the root window, a virtual, invisible window that has the same size as the screen, and is always present. Top level windows are direct children of the root window. UI elements within a top level window are descendants of that window.

[Figure: screenshot of a sample dialog box from the Xfce desktop environment]

For example, consider the dialog box above from the Xfce desktop environment. The entire dialog is an X window. All UI elements in the dialog box - the magnifying glass icon, the text box, the green down arrow, the Close and Launch buttons, and the icons inside those buttons - are also X windows.

The whole dialog window is a child of the root window. The magnifying glass icon, the text box, and the Close and Launch buttons are children of the dialog window. The green down arrow is a child of the text box window, and the icons in the Close and Launch buttons are children of those buttons respectively.
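The hierarchy just described can be written down as a small tree. This is a toy model in Python with a made-up Window class, covering only part of the dialog; it is not how X represents windows internally:

```python
class Window:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def is_top_level(self):
        # Top level windows are the direct children of the root window.
        return self.parent is not None and self.parent.parent is None


root = Window('root')
dialog = Window('dialog', root)
text_box = Window('text box', dialog)
arrow = Window('down arrow', text_box)
close_button = Window('Close button', dialog)
close_icon = Window('Close icon', close_button)

print([w.name for w in root.children])  # prints ['dialog']
print(dialog.is_top_level())            # prints True
print(arrow.is_top_level())             # prints False
```

Only the dialog qualifies as a top level window here; everything else sits further down the tree.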

An important thing to note about X windows is that a child window is clipped to the boundaries of its parent:

A child may be positioned partially or completely outside its parent window, but output to the child is displayed and input received only in the area where the child overlaps with the parent.

— Xlib Programming Manual §2.2.2

For example, if we increase the width of the text box in the dialog above by 2x without changing the size of the dialog box, the portion of the text box that extends outside of the dialog box will become invisible, and clicking on it will not send an event to the text box.
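Clipping amounts to intersecting the child's rectangle with its parent's. Here is a minimal sketch in Python; the Rect type, the visible_area helper and all sizes are assumptions chosen for illustration, with child coordinates expressed relative to the parent's top left corner:

```python
from collections import namedtuple

Rect = namedtuple('Rect', 'x y width height')


def visible_area(child, parent_width, parent_height):
    """Clip a child rectangle (in parent coordinates) to its parent."""
    left = max(child.x, 0)
    top = max(child.y, 0)
    right = min(child.x + child.width, parent_width)
    bottom = min(child.y + child.height, parent_height)
    if right <= left or bottom <= top:
        return None  # nothing of the child is visible
    return Rect(left, top, right - left, bottom - top)


# Assumed sizes: a 300x100 dialog with a text box at (50, 20).
print(visible_area(Rect(50, 20, 200, 30), 300, 100))
# prints Rect(x=50, y=20, width=200, height=30)
print(visible_area(Rect(50, 20, 400, 30), 300, 100))
# prints Rect(x=50, y=20, width=250, height=30)
```

After doubling the width, 150 of the text box's 400 pixels fall outside the dialog and neither show output nor receive input.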

A window manager manages top level windows - that is, direct children of the root window.

Substructure Redirection

In the absence of a window manager, when an application wants to do something with a window - move it, resize it, show/hide it, etc. - its request is directly processed by the X server, and that’s the end of that. A window manager, however, needs to intercept these requests. For example, a window manager may need to know that a new top level window has been created and displayed, in order to draw window decorations (e.g. minimize / maximize / close buttons) around it. It may also need to know that an existing top level window has been resized, in order to redraw the window decorations to reflect the change.

The mechanism that allows a window manager to intercept such requests is called substructure redirection.

This is how substructure redirection works. Suppose we have a window W. If a program M registers for substructure redirection on W, a matching request to modify any direct child window of W will not be executed by the X server. Instead, the X server redirects this request to the program M, which can do whatever it wants with the request, including denying the request outright or granting the request with modifications. More formally,

The structure, as the term is used here, is the location, size, stacking order, border width, and mapping status of a window. The substructure is all these statistics about the children of a particular window. This is the complete set of information about screen layout that the window manager might need in order to implement its policy. Redirection means that an event is sent to the client selecting redirection (usually the window manager), and the original structure−changing request is not executed.

— Xlib Programming Manual §16.2

Note that only direct children of a window W are affected by substructure redirection on W, not any windows further down the hierarchy.

This gets interesting when we consider substructure redirection on the root window:

When the window manager selects SubstructureRedirectMask on the root window, an attempt by any other client to change the configuration of any child of the root window will fail. Instead an event describing the layout change request will be sent to the window manager. The window manager then reads the event and determines whether to honor the request, modify it, or deny it completely. If it decides to honor the request, it calls the routine that the client called that triggered the event with the same arguments. If it decides to modify the request, it calls the same routine but with modified arguments.

— Xlib Programming Manual §16.2

In other words, a window manager must register for substructure redirection on the root window, which causes all creation, destruction, reconfiguration etc. of top level windows - which are direct children of the root window - to be routed to the window manager. This is the magic hook into the X server that window managers rely on to do their job.

This relationship is shown in the following diagram:

[Figure: diagram of substructure redirection on the root window]

Finally, the X server only allows one running program to register for substructure redirection on any given window at any given time. An attempt to register for substructure redirection on a window will fail if another X client has already done the same on the same window, and has not unregistered, disconnected from the X server, or crashed. Since all window managers must register for substructure redirection on the root window, this latter acts as a locking mechanism that prevents two or more window managers from running simultaneously on the same screen.
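The rules in this section fit in a few lines of toy code. The sketch below is a Python model with made-up names (Window, Client, select_substructure_redirect, request_resize); it treats a resize as the only kind of request and is not an X binding:

```python
class Window:
    def __init__(self, parent=None, width=100):
        self.parent = parent
        self.width = width
        self.redirect_client = None  # client selecting substructure redirection


class Client:
    def __init__(self, name):
        self.name = name
        self.events = []


def select_substructure_redirect(client, window):
    """Return False (BadAccess in X) if another client already holds it."""
    if window.redirect_client not in (None, client):
        return False
    window.redirect_client = client
    return True


def request_resize(client, window, width):
    """Apply the resize, or hand it to whoever redirects the parent."""
    owner = window.parent.redirect_client if window.parent else None
    if owner is not None and owner is not client:
        owner.events.append(('ConfigureRequest', window, width))
        return
    window.width = width


root = Window()
top = Window(root)
inner = Window(top)
wm, app = Client('wm'), Client('app')

print(select_substructure_redirect(wm, root))   # prints True
print(select_substructure_redirect(app, root))  # prints False

request_resize(app, top, 300)   # redirected to the window manager
request_resize(app, inner, 50)  # grandchild of root: applied directly
print(top.width, inner.width, len(wm.events))   # prints 100 50 1

request_resize(wm, top, 300)    # the window manager honors the request
print(top.width)                # prints 300
```

The final call mirrors the quoted passage: to honor a request, the window manager issues the same call itself, and its own requests are not redirected.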

Reparenting

In the example dialog box above, we see a title bar with, for example, little buttons for minimizing, maximizing, and closing the window. These UI elements are not created by the application, but by the window manager, via a process known as reparenting or framing:

A window manager can decorate [top level] windows on the screen with titlebars and place little boxes on the titlebar with which the window can be moved or resized. This is only one possibility […].

To do this, the window manager creates a child of the root somewhat larger than the top level window of the application. Then it calls XReparentWindow(), specifying the top level window of the application as win and the new parent [window it just created] as parent. win and all its descendants will then be descendants of parent.

— Xlib Programming Manual §16.3

In other words, if we were to run an X application without a window manager present, the top level window of the application would be a direct child of the root window. With a window manager running, however, the top level window of the application may be reparented by the window manager; it becomes a child of a frame window which is created by the window manager, and which is itself a direct child of the root window. The window manager can add other UI elements inside this frame window alongside the application’s top level window as it sees fit.

Therefore, I kind of lied to you several paragraphs ago: the dialog box shown earlier is really a child window within a frame window created by Xfce's window manager, Xfwm, along with other UI elements for window management:

snfQMUpwYCAfp6q_mmpEVoQ

Reparenting is what allows different window managers to draw different window decorations, and thereby achieve a consistent look-and-feel across windows. However, there are also window managers that do not reparent at all: these are called non-reparenting window managers. There are two reasons why a window manager would not want to reparent:

  1. If a window manager does not draw window decorations around top level windows, it obviously has no need to reparent them. Examples: xmonad, dwm.

  2. Compositing window managers do not always need to reparent windows; we will discuss why below. Example: Compiz. This is not true for all compositing window managers, however; for example, GNOME’s default window manager, Mutter, is a reparenting compositing window manager.

Let’s now consider substructure redirection in the context of reparenting. When a top level window W is first shown (map'ped in X jargon), the window manager is notified because it has registered for substructure redirection on the root window, and a top level window is a direct child of the root window. It then creates a frame F and reparents W, so that W becomes a child of F, which itself is a child of the root window. But since now W is no longer a direct child of the root window, the window manager will no longer be able to intercept changes to W!

Therefore, a reparenting window manager must also subsequently register for substructure redirection on each frame window it creates.
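
The effect of reparenting on interception can be sketched with a toy window tree in Python. This is only a model of the rule described above, not Xlib code; the Window class, frame() and is_intercepted() are hypothetical names.

```python
class Window:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = None
        self.children = []
        if parent is not None:
            parent.add(self)

    def add(self, child):
        # Reparenting: detach the child from its old parent first.
        if child.parent is not None:
            child.parent.children.remove(child)
        child.parent = self
        self.children.append(child)


def frame(root, top_level, redirected):
    """Reparent `top_level` under a new frame and register the frame for redirection."""
    f = Window("frame:" + top_level.name, parent=root)
    f.add(top_level)
    redirected.add(f)
    return f


def is_intercepted(window, redirected):
    """Requests on `window` are intercepted only if its direct parent is registered."""
    return window.parent in redirected
```

Here `redirected` is the set of windows on which the window manager has registered for substructure redirection. If frame() did not add the new frame to that set, is_intercepted() would return False for W after reparenting.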

Compositing

Compositing window managers are a relatively new development. Compositing support in X was added in late 2004, a full decade after the last edition of Xlib Programming Manual. The first compositing window managers, Xfwm and Compiz, launched in early 2005.

So, what exactly does a compositing window manager do?

In our discussion above on substructure redirection and reparenting, we saw how a window manager can respond to various requests for a top level window - to display/hide it (map/unmap in X jargon), to resize it, to move it, etc. But we didn’t talk about how to deal with what’s inside the top level windows.

Indeed, from the perspective of the window manager, top level windows are black boxes; they each manage their own descendant windows (UI elements), perhaps through a framework such as GTK+ or Qt, and the window manager has no right to interfere there. The application that creates a top level window is responsible for rendering and handling events for any descendant windows (UI elements), and does so directly through X. This is shown in the first diagram above.

As the computing power of graphics hardware grew, so did people’s expectations from their window managers. With hardware acceleration, it became possible to build much more computationally intensive user interfaces, such as the (in)famous Desktop Cube in Compiz:

wm_compiz

or the Shift Switcher:

wm_compiz_2

Let’s take a moment to think about how we can implement an interface such as the Shift Switcher above. When the user triggers this interface, we need to:

  1. Render each top level window and all its descendant windows (UI elements) to an off-screen, in-memory buffer, instead of directly to the hardware.

  2. Transform (rotate, contort, etc.) each buffer according to our design.

  3. Composite the transformed buffers into a final buffer along with a background and any other floating UI elements we need to display.

  4. Create an overlay window that covers the entire screen and hides all other windows.

  5. Render the final buffer into the overlay window.

There are a number of challenges:

  • We must be able to retrieve the displayed contents of top level windows. However, as we described earlier, top level windows render their contents directly through X, without going through the window manager.

  • We need to update our interface in real time as the contents of the top level windows change. However, top level windows do not notify window managers when their contents change, again because they render their contents directly through X.

  • A top level window A may overlap with another top level window B below, which means a portion of B isn’t currently displayed. Our interface, however, must capture the full rendering of A and B, regardless of overlapping regions.

  • This whole compositing process is computationally intensive and requires hardware acceleration to function adequately.

It is clear that none of this would be possible without some heavy cooperation from the X server. Enter the Composite extension:

Many user interface operations would benefit from having pixel contents of window hierarchies available without respect to sibling and antecedent clipping. In addition, placing control over the composition of these pixel contents into a final screen image in an external application will enable a flexible system for dynamic application content presentation.

— X Composite Extension

The Composite extension provides a mechanism to request the X server not to render a specific window and its descendants directly to hardware, but to a special buffer maintained by the X server, and do so without the normal clipping and overlap computations. This buffer can then be read and used by the client that made the request.

That’s exactly what a compositing window manager does: it will ask X to render each top level window to an off-screen, in-memory buffer and composite the results into an overlay window itself. And it needs to do this not just for fancy task switcher interfaces as in our example, but also to achieve effects like translucency, animations, soft shadows, and the like.
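
As a rough illustration of the compositing step itself, the following Python sketch paints per-window off-screen buffers into one final buffer, bottom-most window first. It uses characters as stand-in pixels, skips the transform step entirely, and has nothing to do with the actual Composite extension API; render_offscreen() and composite() are hypothetical names.

```python
def render_offscreen(width, height, fill):
    """Each window gets its own complete buffer, unaffected by overlap."""
    return [[fill] * width for _ in range(height)]


def composite(screen_w, screen_h, windows, background="."):
    """Paint window buffers bottom-to-top into one final buffer.

    `windows` is a list of (x, y, buffer) tuples, ordered bottom-most first.
    """
    final = [[background] * screen_w for _ in range(screen_h)]
    for x, y, buf in windows:
        for row_idx, row in enumerate(buf):
            for col_idx, pixel in enumerate(row):
                sx, sy = x + col_idx, y + row_idx
                # Clip anything that falls outside the screen.
                if 0 <= sx < screen_w and 0 <= sy < screen_h:
                    final[sy][sx] = pixel
    return ["".join(row) for row in final]
```

Note that the buffer of a partially covered window stays complete: the overlap is resolved only in the final buffer, which is what makes effects such as translucency or task switchers possible.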

This is illustrated in the following diagram:

sawR1epQiKxE5bX3Cplms2A

Let’s end this section by considering whether a compositing window manager should reparent top level windows.

Since a compositing window manager already knows the size and position of all top level windows, it’s easy for it to just draw window decorations during compositing into the overlay window using graphics operations (e.g. OpenGL), without ever creating an actual X frame window and reparenting. Some compositing window managers do operate this way.

On the other hand, a window manager may need to support both a compositing and a non-compositing mode, for compatibility with older or unsupported graphics hardware. In this case, it needs to implement reparenting and frame windows for non-compositing mode anyway, so additionally implementing drawing window decorations using graphics operations becomes redundant. This is why many other compositing window managers still choose to reparent.

Ready For Some Code?

If you’ve read everything up to this point, you’re probably holding back the urge to cry out "Enough talk - show me some code!" I don’t blame you.

In the next installment in this series, I will walk you through a basic implementation of a reparenting, non-compositing window manager. Impatient? Check out the code on GitHub!

Next chapter: How X Window Managers Work, And How To Write One (Part II)

]]>
<![CDATA[The Most Popular Fonts On The Web: A Study]]>

If you’ve ever worked on a web site, you already know that choosing the right fonts is one of the most important aesthetic decisions in the design of a site. But, like all aesthetic decisions, it is highly subjective.

I decided to try to bring a little

]]>
https://jichu4n.com/posts/the-most-popular-fonts-on-the-web-a-study/5bc6e6bb60307a000159bdbcFri, 04 Apr 2014 07:37:00 GMT

If you’ve ever worked on a web site, you already know that choosing the right fonts is one of the most important aesthetic decisions in the design of a site. But, like all aesthetic decisions, it is highly subjective.

I decided to try to bring a little bit of objectivity into the equation by finding out, empirically, what fonts the most popular sites on the web are using today.

Methodology

I wrote a Python program that crawls the front page of the 100,000 most popular web sites according to Alexa’s top sites list. It parses the HTML using BeautifulSoup, and parses all in-line and linked CSS stylesheets using cssutils. It then looks for font and font-family rules in the CSS rules, and stores the normalized form of each font in those rules in order. The result is kept in a SQLite database for analysis.

This all sounds pretty straightforward, but in fact it took me two weeks to build a crawler that doesn’t choke on all the crazy crap people throw at browsers. I will write up some of the more interesting cases in a separate post.

The final crawl success rate was about 96%. The size of all HTML and stylesheets downloaded was about 30GB.

The Most Popular First-Choice Fonts

First, let’s take a look at the first-choice/most preferred fonts, i.e., the ones that appear first in font-family rules. These fonts most closely represent the intention of the web site designers without compatibility compromises.

For each font, we calculate the number of distinct web sites that have at least one CSS font or font-family rule that lists the font as the first choice. This means that if a site has two rules, one listing Arial and another listing Times, it will count towards both. Thus, the numbers add up to much higher than the total number of sites.
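
The counting rule can be sketched in Python as follows. This is not the crawler's actual code: first_choice(), count_first_choice_fonts() and the shape of site_rules are assumptions made for illustration, and the sketch only handles font-family values, not the font shorthand.

```python
from collections import Counter


def first_choice(font_family_value):
    """Return the normalized first font in a font-family value, or None."""
    first = font_family_value.split(",")[0].strip().strip("'\"").strip()
    return first.lower() or None


def count_first_choice_fonts(site_rules):
    """Count, per font, the number of distinct sites listing it first.

    `site_rules` maps a site name to the font-family values found on it.
    """
    counts = Counter()
    for site, values in site_rules.items():
        fonts = {first_choice(v) for v in values} - {None}
        counts.update(fonts)  # a set, so each site counts once per font
    return counts
```

A site with one rule starting with Arial and another starting with Times adds one to each font's count, which is why the totals exceed the number of sites.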

Without further ado, here’s the breakdown of the top fonts on the web:

(The chart is interactive - hover/click to see actual numbers.)

A couple of observations:

  • Sans-serif fonts dominate the web. The top 25 fonts list only includes 4 serif fonts, compared to 16 sans-serif fonts. The most popular serif font, Georgia, ranks #4 on the list with about 20% of sites.

  • Monospace fonts get no love. Most sites don’t bother specifying a custom monospace font; the most common monospace font specification just uses the browser default (12%). The most popular monospace font, Monaco, is featured on 7.2% of sites. Both of these are quite high, in fact, considering that we only crawl the front page of these sites.

  • The most popular fonts of each family:

    • Sans-serif: Arial (#1, 62%)

    • Serif: Georgia (#4, 20%)

    • Monospace: Monaco (#11, 7.2%)

  • Old Microsoft fonts still reign over the web. The top non-Microsoft font, Helvetica Neue, ranks #6 with an impressive 18% of top web sites.

  • Font Awesome is awesome. About 4.6% of the top 100K sites already use it for universal icons.

  • The Chinese web is on the rise. 3 out of the top 25 fonts are Chinese fonts, compared to 21 Latin fonts (and 1 symbol font). No other scripts made their way into the list.

What If We Considered…

  • Fonts in all positions: (graph) not much difference.

  • The top 1,000 web sites only: (graph) even more Arial (#1, 74%).

The Most Popular Heading Fonts

Next, let’s look at first-choice fonts used in headings or titles.

I used a really crude metric to determine whether a CSS rule matches a heading: whether the CSS selector is for an H1 through H6 tag or contains the strings "heading" or "title". While this finds only a subset of actual headings, it is not a bad approximation as it matches rules on about 58% of the top 100,000 web sites.
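
One way such a check might look in Python is sketched below. It is an approximation of the metric described here, not the code used for the study; looks_like_heading() and the regular expression are assumptions.

```python
import re

# Matches h1..h6 when not glued to other identifier characters.
_HEADING_TAG = re.compile(r"(?<![\w-])h[1-6](?![\w-])", re.IGNORECASE)


def looks_like_heading(selector):
    """Crude check: selector targets h1..h6 or mentions 'heading' or 'title'."""
    lowered = selector.lower()
    return (
        bool(_HEADING_TAG.search(selector))
        or "heading" in lowered
        or "title" in lowered
    )
```

Like the original metric, this deliberately errs on the side of simplicity: it will miss headings styled through unrelated class names and may match selectors that merely contain the word "title".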

Let’s start with the graph:

Observations:

  • Header fonts are more diverse. While Microsoft fonts still reign supreme, a number of more exotic fonts are also on the list. In terms of distribution, there’s a much longer tail. It may be a sign that designers pay a lot more attention to fonts used in headings.

  • Serif fonts are more popular in headings than elsewhere. While Arial still claims the top spot with 27.31% of sites, the top serif font, Georgia, rises to 2nd place with 9% of sites.

  • No monospace fonts in headings. Not surprising.

  • Among Chinese fonts, 宋体 is more commonly used for body text, while 雅黑 (or Yahei) is more commonly used for heading text.

Concluding Thoughts

Arial and friends are the most hated fonts ever. Quoting The Scourge of Arial:

Despite its pervasiveness, a professional designer would rarely - at least for the moment - specify Arial.

And yet, Arial is still the default font on the vast majority of sites on the web, followed closely by its friends. Why?

Quoting Stop Using Arial & Helvetica:

Some people actually have a reason to use them but most use it mindlessly - just because everyone else does. Often, no thought is given to design of the site, let alone typography.

This is pretty sad. Paraphrasing Stop Using Lame Fonts, a good font stack has the potential to really make a site design shine, and it’s a shame web designers aren’t exploiting this opportunity.

Caveats

There are a couple of caveats with this dataset.

First, I am only surveying the landing page of each site. For many sites, notably those whose main interface is hidden behind a login (Facebook, Evernote, etc.), we may not be finding the styles that matter most to users. However, I figured since it would be pretty poor design to build a landing page that is aesthetically inconsistent with the rest of the site, it is not very likely that the font selection on the landing page would be too different from the fonts used elsewhere on the site. Of course, without creating fake accounts on these sites, we have no way to verify.

The numbers are not weighted by prominence on the web pages. One could argue that the font of the main body text on a page should carry more weight than that of the tiny disclaimer text at the bottom which no one reads. It would be tricky to determine what the right weight function is though.

The numbers are not weighted by the prominence of the web sites. For example, we could make it so that Google’s use of Arial would get more weight than some random obscure site’s use of Arial, as Google’s choice impacts many more users and was probably the result of deliberation by a team of expert designers. Again, it’s not entirely obvious how the weights should be assigned.

I am only looking at CSS rules in <style> tags or linked stylesheets. That means I am ignoring style="…" attributes in tags, <font> elements, or dynamically assigned fonts (i.e., through JavaScript). I would be surprised if this turns out to be a big loss though.

Next Steps

I can think of quite a few other useful questions one might find the answer to from this dataset:

  • What are the most common font pairings?

  • What are the popular choices for heading fonts, given that I’ve chosen font X as my body text font?

  • What are the most common fonts on news sites/forums/productivity web apps/social media sites?

  • Is there any correlation between font choice and site popularity?

What do you think?

Please feel free to download the top 100 first-choice fonts as a CSV file.

]]>
<![CDATA[Python: Multiprocessing and Exceptions]]>

Python’s multiprocessing module provides an interface for spawning and managing child processes that is familiar to users of the threading module. One problem with the multiprocessing module, however, is that exceptions in spawned child processes don’t print stack traces.

Consider the following snippet:

import multiprocessing
import
]]>
https://jichu4n.com/posts/python-multiprocessing-and-exceptions/5bc6da5a60307a000159bd4fTue, 11 Mar 2014 06:44:00 GMT

Python’s multiprocessing module provides an interface for spawning and managing child processes that is familiar to users of the threading module. One problem with the multiprocessing module, however, is that exceptions in spawned child processes don’t print stack traces.

Consider the following snippet:

import multiprocessing
import somelib

def f(x):
  return 1 / somelib.somefunc(x)

if __name__ == '__main__':
  with multiprocessing.Pool(5) as pool:
    print(pool.map(f, range(5)))

and the following error message:

Traceback (most recent call last):
  File "test.py", line 9, in <module>
    print(pool.map(f, range(5)))
  File "/usr/lib/python3.3/multiprocessing/pool.py", line 228, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib/python3.3/multiprocessing/pool.py", line 564, in get
    raise self._value
ZeroDivisionError: division by zero

What triggered the ZeroDivisionError? Did somelib.somefunc(x) return 0, or did some other computation in somelib.somefunc() cause the exception? You will notice that we only see the stack trace of the main process, whereas the stack trace of the code that actually triggered the exception in the worker processes is not shown at all.

Luckily, Python provides a handy traceback module for working with exceptions and stack traces. All we have to do is catch the exception inside the worker process, and print it. Let’s change the code above to read:

import multiprocessing
import traceback
import somelib

def f(x):
  try:
    return 1 / somelib.somefunc(x)
  except Exception as e:
    print('Caught exception in worker thread (x = %d):' % x)

    # This prints the type, value, and stack trace of the
    # current exception being handled.
    traceback.print_exc()

    print()
    raise e

if __name__ == '__main__':
  with multiprocessing.Pool(5) as pool:
    print(pool.map(f, range(5)))

Now, if you run the same code again, you will see something like this:

Caught exception in worker thread (x = 0):
Traceback (most recent call last):
  File "test.py", line 7, in f
    return 1 / somelib.somefunc(x)
  File "/path/to/somelib.py", line 2, in somefunc
    return 1 / x
ZeroDivisionError: division by zero

Traceback (most recent call last):
  File "test.py", line 16, in <module>
    print(pool.map(f, range(5)))
  File "/usr/lib/python3.3/multiprocessing/pool.py", line 228, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib/python3.3/multiprocessing/pool.py", line 564, in get
    raise self._value
ZeroDivisionError: division by zero

The printed traceback reveals somelib.somefunc() to be the actual culprit.

In practice, you may want to save the exception and the stack trace somewhere. For that, you can use the file argument of print_exc in combination with StringIO. For example:

import logging
import io  # Import StringIO in Python 2
import traceback
...

def Work(...):
  try:
    ...
  except Exception as e:
    exc_buffer = io.StringIO()
    traceback.print_exc(file=exc_buffer)
    logging.error(
        'Uncaught exception in worker process:\n%s',
        exc_buffer.getvalue())
    raise e
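
The same idea can also be packaged as a reusable decorator, so that each worker function does not need its own try/except. The following is a minimal sketch; log_worker_exceptions and reciprocal are hypothetical names, and it uses traceback.format_exc() in place of the StringIO buffer.

```python
import functools
import logging
import traceback


def log_worker_exceptions(func):
    """Log the full traceback of any exception raised by `func`, then re-raise."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # format_exc() returns the same text that print_exc() would print.
            logging.error(
                'Uncaught exception in worker process:\n%s',
                traceback.format_exc())
            raise

    return wrapper


@log_worker_exceptions
def reciprocal(x):
    return 1 / x
```

Calling reciprocal(0) logs the traceback from inside the function and then re-raises the ZeroDivisionError, so the caller still sees the failure.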
]]>
<![CDATA[C++: Access Specifiers and Overriding]]>

Consider the following C++ code:

#include <iostream>
using namespace std;

class A {
 public:
  virtual void f() = 0;
};

class B: public A {
 private:
  virtual void f() {
    cout << "B::f()" << endl;
  }
};

class C: public B {
 protected:
  virtual void f() {
   cout << "C:
]]>
https://jichu4n.com/posts/c-access-specifiers-and-overriding/5bc6e5ea60307a000159bdb1Sat, 01 Mar 2014 08:33:00 GMT

Consider the following C++ code:

#include <iostream>
using namespace std;

class A {
 public:
  virtual void f() = 0;
};

class B: public A {
 private:
  virtual void f() {
    cout << "B::f()" << endl;
  }
};

class C: public B {
 protected:
  virtual void f() {
   cout << "C::f()" << endl;
  }
};

int main() {
  B *b = new B();
  // This is NOT legal:
  //   b->f();
  C *c = new C();
  // Nor is this:
  //   c->f();

  // Why is this legal?
  A *a1 = b;
  a1->f();  // Prints "B::f()"
  A *a2 = c;
  a2->f();  // Prints "C::f()"

  return 0;
}

My first reaction, and perhaps yours too, is that this code shouldn’t be legal. The base class A defines a public pure virtual method f() that must be overridden in derived classes. But the f() implementations in both derived classes are non-public; B and C actually don’t conform to the interface of the base class (unless cast to the base class), as shown in the first block in main.

It turns out, however, that in C++ a method in a derived class overrides a method in a base class regardless of the access specifiers of the two methods. In other words, it does not matter whether the method is declared in the base class as public, protected, or private, nor does it matter how the method is declared in the derived class; as long as they have the same signature, the method in the derived class always overrides the method in the base class:

If a virtual member function vf is declared in a class Base and in a class Derived, derived directly or indirectly from Base, a member function vf with the same name, parameter-type-list (8.3.5), cv-qualification, and ref-qualifier (or absence of same) as Base::vf is declared, then Derived::vf is also virtual (whether or not it is so declared) and it overrides Base::vf.

— C++ standard §10.3.2

(Emphasis mine.)

You will notice that access specifiers — public, private, etc. — are specifically omitted from the list of criteria for determining an override relationship. Thus, the above code works exactly as if both B::f() and C::f() had been declared public.

Another consequence of this rather bizarre omission is that the language actually allows you to public-ly inherit from a base class, and yet not conform to its public interface unless cast to the base class, as illustrated in the first block in main above. Or, from the derived class’s perspective, the language allows clients to legally access methods explicitly marked private or protected through a cast to the base class.

So, accidental omission or well thought-out design decision? Any ideas?

]]>
<![CDATA[How To Set Default Fonts and Font Aliases on Linux]]>

Font Woes

It’s fairly straightforward to set the default font used in native apps on a modern Linux desktop, or the default fonts used to render web pages in your browser of choice.

But if you’re reading this, you probably know that that’s far

]]>
https://jichu4n.com/posts/how-to-set-default-fonts-and-font-aliases-on-linux/5bc6b58fa2b173000150bc2cSun, 23 Feb 2014 05:07:00 GMT

Font Woes

It’s fairly straightforward to set the default font used in native apps on a modern Linux desktop, or the default fonts used to render web pages in your browser of choice.

But if you’re reading this, you probably know that that’s far from the end of the story. You might have noticed that Firefox and Chrome rudely ignore your font settings for many websites. This is because many (if not most) popular sites, including Google, Yahoo, Facebook or GitHub, specify preferred fonts for text:

  • Google: arial, sans-serif
  • Yahoo: Helvetica Neue, Helvetica, Arial
  • Facebook: lucida grande, tahoma, verdana, arial, sans-serif
  • GitHub: Helvetica, arial, freesans, clean, sans-serif

(Retrieved on 02/21/2014)

You might immediately notice that the most commonly used fonts on these sites, Arial and Helvetica, are proprietary fonts that come bundled with Microsoft Windows and Mac OS X respectively, and are most likely not installed on your Linux system. In this case, what font is actually used is anyone’s guess. If they are installed (e.g., via a package like ttf-ms-fonts or directly copied from a Windows machine), well, you still probably want to display your favorite font instead :)

So, let’s find out what your default fonts and aliases are with fc-match:

for family in serif sans-serif monospace Arial Helvetica Verdana "Times New Roman" "Courier New"; do
  echo -n "$family: "
  fc-match "$family"
done

This is what I get on my machine by default:

serif: DejaVuSerif.ttf: "DejaVu Serif" "Book"
sans-serif: DejaVuSans.ttf: "DejaVu Sans" "Book"
monospace: DejaVuSansMono.ttf: "DejaVu Sans Mono" "Book"
Arial: DejaVuSans.ttf: "DejaVu Sans" "Book"
Helvetica: n019003l.pfb: "Nimbus Sans L" "Regular"
Verdana: DejaVuSans.ttf: "DejaVu Sans" "Book"
Times New Roman: DejaVuSerif.ttf: "DejaVu Serif" "Book"
Courier New: DejaVuSansMono.ttf: "DejaVu Sans Mono" "Book"

Font Configuration Files

So, assuming you’ve installed your fonts of choice (via a package, copying to /usr/share/fonts or ~/.fonts - please verify with the fc-list command), how do you set them as default in all apps and web sites?

Well, there are two places where fonts are configured: system-wide configuration resides in /etc/fonts/, and per-user configs are stored in ~/.config/fontconfig/fonts.conf (note that this used to be ~/.fonts.conf before fontconfig 2.10.1). For simplicity’s sake, we’ll do it in ~/.config/fontconfig/fonts.conf.

Let’s open up ~/.config/fontconfig/fonts.conf, or create it if it doesn’t already exist. Put the following skeleton structure in there:

<?xml version='1.0'?>
<!DOCTYPE fontconfig SYSTEM 'fonts.dtd'>
<fontconfig>
</fontconfig>

We will put all of our custom configuration between <fontconfig> and </fontconfig>.

Setting Default Fonts

First, let’s set the default serif, sans serif, and monospace fonts. I’ll use the beautiful Chrome OS fonts as an example (ttf-croscore if you’re running Arch Linux). Insert the following between <fontconfig> and </fontconfig>:

  <!-- Set preferred serif, sans serif, and monospace fonts. -->
  <alias>
    <family>serif</family>
    <prefer><family>Tinos</family></prefer>
  </alias>
  <alias>
    <family>sans-serif</family>
    <prefer><family>Arimo</family></prefer>
  </alias>
  <alias>
    <family>sans</family>
    <prefer><family>Arimo</family></prefer>
  </alias>
  <alias>
    <family>monospace</family>
    <prefer><family>Cousine</family></prefer>
  </alias>

Aliasing Microsoft Fonts

Now, we will create aliases for commonly used fonts like Arial and Helvetica, so that our favorite fonts will always be used instead of these fonts, e.g. when requested by a web site.

Insert the following between <fontconfig> and </fontconfig>, after the previous snippet:

  <!-- Aliases for commonly used MS fonts. -->
  <match>
    <test name="family"><string>Arial</string></test>
    <edit name="family" mode="assign" binding="strong">
      <string>Arimo</string>
    </edit>
  </match>
  <match>
    <test name="family"><string>Helvetica</string></test>
    <edit name="family" mode="assign" binding="strong">
      <string>Arimo</string>
    </edit>
  </match>
  <match>
    <test name="family"><string>Verdana</string></test>
    <edit name="family" mode="assign" binding="strong">
      <string>Arimo</string>
    </edit>
  </match>
  <match>
    <test name="family"><string>Tahoma</string></test>
    <edit name="family" mode="assign" binding="strong">
      <string>Arimo</string>
    </edit>
  </match>
  <match>
    <!-- Insert joke here -->
    <test name="family"><string>Comic Sans MS</string></test>
    <edit name="family" mode="assign" binding="strong">
      <string>Arimo</string>
    </edit>
  </match>
  <match>
    <test name="family"><string>Times New Roman</string></test>
    <edit name="family" mode="assign" binding="strong">
      <string>Tinos</string>
    </edit>
  </match>
  <match>
    <test name="family"><string>Times</string></test>
    <edit name="family" mode="assign" binding="strong">
      <string>Tinos</string>
    </edit>
  </match>
  <match>
    <test name="family"><string>Courier New</string></test>
    <edit name="family" mode="assign" binding="strong">
      <string>Cousine</string>
    </edit>
  </match>

Note that the Microsoft fonts are aliased directly to our preferred substitute fonts. Aliasing to generic families (serif, sans-serif etc.) may or may not work depending on your configuration in /etc/fonts (it didn’t work for me), so it’s safer this way.

This list is of course by no means definitive; add/remove aliases as you like.

The Result

You’ll need to log out and back in for all applications to update. You should see the difference immediately:

Google search results, before (Arial):

font_config_before

Google search results, after (Arimo):

font_config_after

You can verify that the aliases have been set up correctly with fc-match:

for family in serif sans-serif monospace Arial Helvetica Verdana "Times New Roman" "Courier New"; do
  echo -n "$family: "
  fc-match "$family"
done

which should now give you something like:

serif: Tinos-Regular.ttf: "Tinos" "Regular"
sans-serif: Arimo-Regular.ttf: "Arimo" "Regular"
monospace: Cousine-Regular.ttf: "Cousine" "Regular"
Arial: Arimo-Regular.ttf: "Arimo" "Regular"
Helvetica: Arimo-Regular.ttf: "Arimo" "Regular"
Verdana: Arimo-Regular.ttf: "Arimo" "Regular"
Times New Roman: Tinos-Regular.ttf: "Tinos" "Regular"
Courier New: Cousine-Regular.ttf: "Cousine" "Regular"

Other Notes

Some existing examples you may find online show the following syntax:

<!-- Deprecated syntax -->
<match target="pattern" name="family">
  <test name="family" qual="any"><string>Arial</string></test>
  ...
</match>

This will produce an error message like

Fontconfig error: "/home/username/.config/fontconfig/fonts.conf", line 38: invalid attribute 'name'

The fix is to change <match target="pattern" name="family"> to just <match>, as shown above.

]]>
<![CDATA[Unicode I/O and Locales in Python]]>

I recently ran into a weird error when running some Python code in a chroot jail.

s = '你好'
with open('/tmp/asdf', 'w') as f:
  f.write(s)

gave me

Traceback (most recent call last):
  File "<stdin>", line 1,
]]>
https://jichu4n.com/posts/unicode-io-and-locales-in-python/5bc6e65560307a000159bdb7Thu, 20 Feb 2014 08:35:00 GMT

I recently ran into a weird error when running some Python code in a chroot jail.

s = '你好'
with open('/tmp/asdf', 'w') as f:
  f.write(s)

gave me

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.3/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)

The same happened with interprocess I/O:

with subprocess.Popen(
    '/usr/bin/cat',
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True) as proc:
  (cmd_stdout, cmd_stderr) = proc.communicate('你好')

gave me

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.3/subprocess.py", line 578, in check_output
    output, unused_err = process.communicate(timeout=timeout)
  File "/usr/lib/python3.3/subprocess.py", line 908, in communicate
    stdout = _eintr_retry_call(self.stdout.read)
  File "/usr/lib/python3.3/subprocess.py", line 479, in _eintr_retry_call
    return func(*args)
  File "/usr/lib/python3.3/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)

It turns out that Python str's are encoded to/decoded from raw bytes during I/O (print, file I/O, IPC, etc) using the default system locale encoding. The advantage is that, if your system locale is set up correctly, everything just works - there’s no explicit encoding/decoding between strings and bytes. The downside is that your Python code that runs fine on one machine can fail mysteriously on a different machine.
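
A quick way to see which encodings a given environment will use is to ask Python directly. The helper below is a small sketch (describe_io_encodings is a hypothetical name); it only reports values and does not change anything.

```python
import locale
import sys


def describe_io_encodings():
    """Return the encodings Python reports for default text I/O."""
    return {
        # Encoding derived from the locale, used for text-mode I/O by default.
        "preferred": locale.getpreferredencoding(False),
        # Encoding of standard output; may be None if stdout was replaced.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding used for file names.
        "filesystem": sys.getfilesystemencoding(),
    }
```

Printing the result of describe_io_encodings() on two machines is an easy first check when the same code behaves differently on each.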

In my case, the chroot jail yielded:

$ locale
LANG=C
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=

Solution A

The simplest solution is to set the system locale, either just for the Python program or for your shell. For example,

# Run ./my_program.py with a custom LANG value.
LANG=en_US.utf-8 ./my_program.py

or

# Set locale for current shell session.
export LANG=en_US.utf-8
./my_program.py

In fact, it’s probably a good idea to add the export line to your ~/.bashrc, or follow however your Linux distro decides locales should be set.

Solution B

On the other hand, you can explicitly set the encoding used during I/O in your Python code.

For file I/O, in Python 3.x, you can set the encoding argument of open:

# Python 3.x
with open('/tmp/asdf', 'w', encoding='utf-8') as f:
  f.write('你好')

In Python 2.x, you can use codecs.open:

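# -*- coding: utf-8 -*-
# Python 2.x (the coding line above is needed for non-ASCII literals in the source)
import codecs
with codecs.open('/tmp/asdf', 'w', encoding='utf-8') as f:
  # Use a unicode literal; a plain byte str would first be decoded as ASCII and fail.
  f.write(u'你好')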

Alternatively, you can open the file in binary mode and encode the string yourself:

with open('/tmp/asdf', 'wb') as f:
  f.write('你好'.encode('utf-8'))
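As a rough illustration of how the text-mode and binary-mode approaches relate, the sketch below (Python 3.x) writes a string with an explicit encoding and reads it back both as raw bytes and as text. The roundtrip helper and its use of a temporary file are assumptions made for this example, not part of the original program.

```python
import os
import tempfile


def roundtrip(text, encoding='utf-8'):
    """Write text with an explicit encoding; return (raw bytes, decoded text)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Text mode with an explicit encoding: the locale is never consulted.
        with open(path, 'w', encoding=encoding) as f:
            f.write(text)
        # Binary mode: read back exactly the bytes that were written.
        with open(path, 'rb') as f:
            raw = f.read()
        # Text mode again, decoding with the same explicit encoding.
        with open(path, 'r', encoding=encoding) as f:
            decoded = f.read()
    finally:
        os.remove(path)
    return raw, decoded


if __name__ == '__main__':
    raw, decoded = roundtrip('\u4f60\u597d')  # the string '你好'
    print(raw)
    print(decoded == '\u4f60\u597d')
```

The first byte of the UTF-8 encoding of '你好' is 0xe4, the same byte the ascii codec rejected in the traceback above.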

For IPC with subprocess, avoid universal_newlines=True, as that makes subprocess encode/decode the pipes using the system locale. Instead, exchange bytes with the child process and convert explicitly:

import subprocess

# Pipes carry raw bytes; encode the input and decode the output explicitly.
with subprocess.Popen(
    '/usr/bin/cat',
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE) as proc:
  (cmd_stdout_bytes, cmd_stderr_bytes) = proc.communicate('你好'.encode('utf-8'))
  (cmd_stdout, cmd_stderr) = (
      cmd_stdout_bytes.decode('utf-8'), cmd_stderr_bytes.decode('utf-8'))
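The same idea applies to subprocess.check_output, the call that produced the traceback above. The following is a minimal sketch under stated assumptions: check_output_utf8 and CHILD are hypothetical names for this example, and the child command is simply the current Python interpreter printing two UTF-8 encoded characters. (Python 3.6 and later also accept encoding and errors arguments on the subprocess functions, which is another way to get the same result.)

```python
import subprocess
import sys

# Hypothetical child command for illustration: writes '你好' as UTF-8 bytes.
# The text is spelled with escapes so the command line itself stays ASCII.
CHILD = "import sys; sys.stdout.buffer.write('\\u4f60\\u597d'.encode('utf-8'))"


def check_output_utf8(args):
    """Run a command and decode its stdout as UTF-8, ignoring the locale."""
    # Without universal_newlines, check_output returns bytes.
    raw = subprocess.check_output(args)
    return raw.decode('utf-8')


if __name__ == '__main__':
    text = check_output_utf8([sys.executable, '-c', CHILD])
    # ascii() keeps the print itself safe even when stdout is ASCII-only.
    print(ascii(text))
```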
]]>
<![CDATA[Disabling Screen-Off Animation In Android]]>

I am impressed by the amazing progress Android’s UI has made since its inception. With each iteration, likewise in software as in hardware, its usability and aesthetics have steadily improved to the point where I can now confidently buy one for my mom to replace the iPhone I

]]>
https://jichu4n.com/posts/disabling-screen-off-animation-in-android/5bc6e82f60307a000159bdc8Sat, 02 Feb 2013 08:43:00 GMT

I am impressed by the amazing progress Android’s UI has made since its inception. With each iteration, in software as in hardware, its usability and aesthetics have steadily improved to the point where I can now confidently buy one for my mom to replace the iPhone I recommended to her only last year.

On the other hand, one particular "feature" I intensely dislike is the screen-off animation that emulates the brief white flash you see when an old CRT TV monitor is powering off. Introduced in Gingerbread (2.3), it seems not to bother most people as much as it bothers me, and even seems to be quite popular (e.g., this thread).

I was glad to learn, though, that a number of people find it as annoying as I do, for example in this thread and this bug.

As a CyanogenMod fan, I was quite happy with the straightforward checkbox in the Settings app in CyanogenMod 9 that toggled whether to play this animation. In CM10, this option was removed, but there was an easy hack involving changing window animation scale in the developer settings, as explained in the second link above.

So when I updated to CM10.1 M1, I was very unpleasantly surprised to find that this hack does not work any more either. I tried to immediately downgrade to CM10, but then discovered that doing so requires a full data wipe (of course…). A quick search pulled up this thread which proposes a simple prop edit, but the following posts in the thread suggest mixed results.

So, stuck with CM10.1, and really disgusted at this turn of events, I finally decided to put on my hax0r gloves and get this annoying "feature" out of my face once and for all.

Poking Around

I went to androidxref.com and started searching the Android source code for "screen off animation". (OK, I actually started out with grep, but it didn’t take me long to realize androidxref.com was about sixty million times faster - it’s really pretty cool.)

I quickly found why changing the window animation scale in Jelly Bean 4.1 (and CM10) disables the screen-off animation. The two snippets of code implementing this behavior are at line 470 and line 2228 of PowerManagerService.java.

In Jelly Bean 4.2 (and CM10.1), however, these parts of the code have been completely refactored. The snippet of code that launches the screen-off animation is now found at line 704 of DisplayPowerController.java. Note that this new DisplayPowerController class has been factored out of the old PowerManagerService class, and both have moved into a separate power sub-directory.

The Hacking

Disclaimer: I am NOT responsible for anything that happens to your phone or you or your house or your relationship with your wife if you follow the instructions below. If you don’t know what you’re doing and are scared of bricking your phone, just give up and go watch the Super Bowl. Please.

What did not change, I found, is that this code gets packaged into /system/framework/services.jar on the Android system. So, let’s take a look at this file as found in the CM10.1 image on my phone:

# Copy services.jar from phone to current directory.
adb pull /system/framework/services.jar
# Disassemble.
apktool d services.jar

(apktool is a tool that extracts and disassembles Android APK/JAR files, and is available for Linux, Windows and OS X.) This produces a directory, services.jar.out, which contains the disassembled code of the services.jar we just pulled from the connected phone. I dived into the DisplayPowerController code and edited the else clause at line 704 to essentially just call setScreenOn(false); directly instead of running the animation first. If you look at the disassembled code, you’ll see that Dalvik bytecode is really quite readable and easy to hack, especially since the line number hints allow you to directly map a line in the Java source code to the corresponding Dalvik instructions. I just had the Java source file on androidxref.com open in a browser tab for reference and killed instructions line by line. You can download my patch here and apply it like this:

patch -d services.jar.out -Np1 < path/to/disable_screen_off_animation.patch

(For CM10.1 nightlies (after the MR1.1 merge) and CM10.1 M2, use this patch instead. The same piece of code was shuffled around to line 777. For CM10.1 M3, use this patch instead.)

Now re-assemble the code and push it back onto the device:

# Re-assemble modified sources as services-mod.jar.
apktool b services.jar.out services-mod.jar
# Copy services-mod.jar to /sdcard/ on phone.
adb push services-mod.jar /sdcard/
# Remount /system partition as read-write in order to modify it.
adb shell su -c 'mount -o rw,remount /system'
# Overwrite original services.jar on phone with services-mod.jar as root.
adb shell su -c 'cp /sdcard/services-mod.jar /system/framework/services.jar'
# Reboot phone for this to take effect.
adb reboot

When the phone reboots, it will say Android is upgrading and will rebuild the Dalvik cache for each application, so it might take a while. But hey, it’s worth it.
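Since a broken services.jar can keep the phone from booting, it may be worth sanity-checking the rebuilt archive on the computer before pushing it. The sketch below is one possible check, not part of the original procedure: looks_like_dalvik_jar is a hypothetical helper, and it assumes the framework jar is a regular zip archive with a classes.dex entry, which holds for deodexed ROMs but should be verified against the original services.jar pulled from your own device.

```python
import sys
import zipfile


def looks_like_dalvik_jar(path):
    """Return True if path is an intact zip archive containing classes.dex."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as jar:
        # testzip() returns the name of the first corrupt member, or None.
        if jar.testzip() is not None:
            return False
        # Assumption: the Dalvik bytecode lives in a top-level classes.dex entry.
        return 'classes.dex' in jar.namelist()


if __name__ == '__main__':
    jar_path = sys.argv[1] if len(sys.argv) > 1 else 'services-mod.jar'
    print('OK' if looks_like_dalvik_jar(jar_path) else 'Suspicious archive')
```

This only checks that the archive is structurally sound; it says nothing about whether the patched bytecode itself is correct.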

Parting Words

The final product, if you just want to replace the services.jar on your phone, is here:

  • CyanogenMod 10.1 M1 - tested on Galaxy Nexus (maguro) and Nexus S (crespo), should work on other phones as well (can’t promise - let me know):

  • CyanogenMod 10.1 M2 - tested on Galaxy Nexus (maguro), should work on other phones as well (can’t promise - let me know):

  • CyanogenMod 10.1 M3 - tested on Galaxy Nexus (maguro), should work on other phones as well (can’t promise - let me know):

  • CyanogenMod 10.1 RC1 - tested on Galaxy Nexus (maguro), should work on other phones as well (can’t promise - let me know):

  • Stock 4.2.2 (takju/JDQ39) - for "takju" factory image from Google

  • Stock 4.2.2 (yakju/JDQ39) - for "yakju" factory image from Google

Apply the services-mod.jar to your phone like this (obviously assuming you have adb working and root):

# Copy services-mod.jar to /sdcard/ on phone.
adb push services-mod.jar /sdcard/
# Remount /system partition as read-write in order to modify it.
adb shell su -c 'mount -o rw,remount /system'
# Overwrite original services.jar on phone with services-mod.jar as root.
adb shell su -c 'cp /sdcard/services-mod.jar /system/framework/services.jar'
# Reboot phone for this to take effect.
adb reboot

Or, I suppose, you could just use ES File Explorer with root mode enabled, mount /system as read-write and copy the file to /system/framework/services.jar on the phone.

Note that, whichever method you use, a reboot is needed after you replace services.jar, and upon the first reboot the "Android is upgrading" dialog will pop up.

]]>