Screen Recording Permissions in Catalina are a Mess

A screen recording permission dialog in macOS Catalina

One of macOS Catalina’s multiple contributions to improving the security of the Mac is a change to the way that screen recording works. The system now requires that the user give applications permission to record their screen, similar to how applications must request webcam or microphone access.

I’m glad that this is being addressed. Allowing applications unrestricted access to the contents of your screen is both a privacy concern and a security risk, yet apps previously required neither the user’s approval nor their knowledge. The change in Catalina is a good idea… in theory. In practice, the implementation is a complete mess, and a source of headaches for developers.

Documentation

The first problem is documentation, or rather the lack of it. As far as I can tell, the only official developer documentation for this security feature is this segment of the “Advances in macOS Security” WWDC session. The WWDC video library is nice to have, but it’s not a suitable replacement for comprehensive documentation.

So what APIs are affected by this change? I know of three: CGDisplayStream, CGDisplayCreateImage, and CGWindowListCopyWindowInfo. All of them have different behavior, which further complicates things.

  • CGDisplayStream triggers a permission dialog and returns nil from its initializer if the application doesn’t have screen capture permission.
  • CGDisplayCreateImage can also show a prompt but always returns an image. If permissions are disabled, the image only contains the desktop background, menu bar, and the calling application’s windows.
  • CGWindowListCopyWindowInfo won’t show a permission dialog. If the application doesn’t have screen recording permission, then the available window information will be limited (more on this later).

Are there more? Probably, but there isn’t an official list, as far as I can tell.

User Experience

Permission prompts are intrusive by default, so they should ideally be made as simple and clear as possible. Apple has recently focused on streamlining the permission flow on iOS. This same treatment hasn’t been applied to Catalina’s screen recording permission.

Other permission dialogs in macOS generally use the standard yes/no buttons. For screen recording, the user is presented with two options: “Deny” or “Open System Preferences.” To make this detour to System Preferences more tedious, the app then needs to be restarted before the permission change takes effect. I can give Apple the benefit of the doubt here and assume that this is a technical limitation that couldn’t be worked around without breaking things. What I can’t excuse is the lack of a usage string.

An automation permission dialog in macOS, containing a usage string

Most permission dialogs allow developers to specify additional text explaining why the application needs access to the resource (as described in Accessing Protected Resources). Apple usually considers usage strings to be so essential that omitting one can cause an application to crash. Yet for some reason, the screen recording permission dialog doesn’t allow a usage string. The inability of developers to tell their users what the application is trying to do is a glaring UX flaw, and it can leave users choosing blindly.
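For comparison, here is a minimal sketch of where usage strings live for the permissions that do support them: they are plain entries in the app's Info.plist. The helper name usageDescription(forKey:in:) is hypothetical; the three keys listed are the standard microphone, camera, and automation keys. Per the text above, Catalina has no equivalent entry for screen recording.

```swift
import Foundation

/// Hypothetical helper: looks up a usage description in an
/// Info.plist-style dictionary. Returns nil when the key is missing
/// or the string is empty.
func usageDescription(forKey key: String, in info: [String: Any]) -> String? {
    guard let text = info[key] as? String, !text.isEmpty else { return nil }
    return text
}

// Inspect the running bundle's Info.plist. The dictionary can be nil
// for command-line tools, so fall back to an empty one.
let bundleInfo = Bundle.main.infoDictionary ?? [:]
let usageKeys = ["NSMicrophoneUsageDescription",
                 "NSCameraUsageDescription",
                 "NSAppleEventsUsageDescription"]
for key in usageKeys {
    print(key, usageDescription(forKey: key, in: bundleInfo) ?? "<missing>")
}
```

Running a check like this in a debug build is one way to catch a missing usage string before the system does.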

How to Check Permission Status

If you are developing an application that uses the screen recording APIs, then you’ll likely want to know whether the user has given your app approval. Catalina doesn’t provide an official API for this, so you’re stuck with an unofficial workaround.

First, the relatively simple approach: just create a CGDisplayStream and check if the initializer fails. If the application hasn’t tried to record the user’s screen before, then the permission dialog will be displayed and the code returns false.

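import CoreGraphics
import CoreVideo

// Actively checks for screen recording permission (function name is illustrative).
// Creating the stream may cause the system permission dialog to appear.
func canRecordScreen() -> Bool {
    let stream = CGDisplayStream(display: CGMainDisplayID(),
                                 outputWidth: 1,
                                 outputHeight: 1,
                                 pixelFormat: Int32(kCVPixelFormatType_32BGRA),
                                 properties: nil,
                                 handler: { _, _, _, _ in })

    // The initializer fails (returns nil) when permission is missing
    return stream != nil
}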

Keep in mind that the screen capture permission dialog doesn’t block the calling thread. If your application shows an alert when the permission check fails, then the user will see two dialog boxes the first time.
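One way to avoid stacking two dialogs is to remember whether the system prompt has already been triggered, and only show the app's own alert on later failures. The following is a minimal sketch under that assumption. PromptTracker, shouldShowOwnAlert, and the didTriggerScreenCapturePrompt defaults key are hypothetical names, not part of any Apple API.

```swift
import Foundation

/// Hypothetical helper: remembers whether an active permission check has
/// already failed once (and so already caused the system prompt to appear).
struct PromptTracker {
    let defaults: UserDefaults
    let key = "didTriggerScreenCapturePrompt" // illustrative key name

    /// Returns true if the app should present its own alert.
    /// - Parameter hasPermission: result of an active check, such as the
    ///   CGDisplayStream probe described above.
    func shouldShowOwnAlert(hasPermission: Bool) -> Bool {
        // Nothing to explain if recording is already allowed.
        if hasPermission { return false }

        let promptedBefore = defaults.bool(forKey: key)
        defaults.set(true, forKey: key)

        // On the first failure the system dialog is presumably on screen,
        // so stay quiet; on later failures, show the app's own alert.
        return promptedBefore
    }
}

// Usage (assuming `granted` holds the result of the active check):
// let tracker = PromptTracker(defaults: .standard)
// if tracker.shouldShowOwnAlert(hasPermission: granted) { /* show alert */ }
```

The trade-off is that the first failed check produces no in-app feedback; the app relies on the system dialog alone for that first run.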

If you want to check for screen recording permission passively (without showing a permission dialog), then it gets more complicated. I did some research and found this question on Stack Overflow that had some helpful information. Eventually, I managed to develop a heuristic that seems to work reliably.

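import AppKit

// Passively checks for screen recording permission (function name is illustrative).
// This never causes the system permission dialog to appear.
func canRecordScreenPassively() -> Bool {
	guard let windowList = CGWindowListCopyWindowInfo(.excludeDesktopElements, kCGNullWindowID)
		as NSArray? else { return false }

	for case let windowInfo as NSDictionary in windowList {
		// Ignore windows owned by this application
		let windowPID = windowInfo[kCGWindowOwnerPID] as? pid_t
		if windowPID == NSRunningApplication.current.processIdentifier {
			continue
		}

		// Ignore system UI elements
		if windowInfo[kCGWindowOwnerName] as? String == "Window Server" {
			continue
		}

		// Another app's window title is only visible with permission
		if windowInfo[kCGWindowName] != nil {
			return true
		}
	}

	return false
}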

The key to this approach is CGWindowListCopyWindowInfo, because it won’t cause the system to show a permission dialog. The function returns an array of dictionaries, with each dictionary representing a window. If the application hasn’t been authorized for screen capture, then the kCGWindowName key won’t be present and kCGWindowSharingState will be zero (this is briefly discussed in the previously mentioned WWDC video). There are exceptions that need to be accounted for: applications can always access their own windows, as well as certain system UI elements like the Dock.
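To see these fields on your own machine, a small diagnostic that prints the owner, title presence, and sharing state of each window can help. This is only a sketch for exploration, not a replacement for the heuristic. The function names summarize(window:) and dumpWindowList() are hypothetical, and the output will depend on the OS version and on whether permission has been granted.

```swift
import CoreGraphics

/// Summarizes one window dictionary: owner name, whether a title is
/// exposed, and the raw sharing state (reported as 0 when absent).
func summarize(window info: [String: Any]) -> String {
    let owner = info[kCGWindowOwnerName as String] as? String ?? "?"
    let hasName = info[kCGWindowName as String] != nil
    let sharing = info[kCGWindowSharingState as String] as? Int ?? 0
    return "\(owner): name=\(hasName) sharing=\(sharing)"
}

/// Prints a summary line for every on-screen, non-desktop window.
/// CGWindowListCopyWindowInfo does not show a permission dialog.
func dumpWindowList() {
    let options: CGWindowListOption = [.optionOnScreenOnly, .excludeDesktopElements]
    guard let list = CGWindowListCopyWindowInfo(options, kCGNullWindowID)
        as? [[String: Any]] else {
        print("no window list available")
        return
    }
    list.forEach { print(summarize(window: $0)) }
}
```

Calling dumpWindowList() once before and once after granting permission in System Preferences makes it easy to compare what the system exposes in each state.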

And now I have to ask… why? Why do developers have to rely on flimsy heuristics that could break in a future version? Why isn’t there an official function in the API for this? Why do we have to rely on mildly obscure Stack Overflow questions just to figure this stuff out?

I do genuinely appreciate the additional security. But as it exists now, the implementation is a mess.