How To Fight Things in Three Dimensions: Zelda’s Z-Targeting

Can’t believe I haven’t done one of these on this topic yet. The Legend of Zelda: Ocarina of Time is rightfully revered for how it set the tone for what action and adventure games could be in what was, in 1998, the relatively new frontier of polygonal 3D games. Moving to 3D comes with a whole host of problems, though, especially when it comes to active combat. Our real three-dimensional space is very complicated, and abstracting that to a computer program can have some disorienting results if not done with care. One of Zelda’s most notable contributions to the craft, I think, is the Z-Targeting system. “Z-Targeting” is the name for Ocarina of Time’s 3D targeting system, which lets the player focus the game camera’s attention on a single point of interest by tapping the “Z” button. It gets plenty of mention, but honestly I feel like this one innovation doesn’t get praised enough. It kind of set the standard for how real-time gameplay involving two moving bodies works, even to this day. There are also a lot of little things that helped this first iteration of a 3D targeting system work remarkably well, despite its age.

Child Link (The Legend of Zelda: Ocarina of Time) strafes to the left and right while a targeting crosshair focuses on a rock in a grassy forest. The rock remains center-camera, while Link shifts to either side of the camera.
The rock was very patient with me during the filming of this clip.

Notice in the image above how the camera smoothly and automatically situates Link to one side. You may have heard of the rule of thirds, a stylistic concept in art for generating compelling composition. Dividing an image into thirds and placing the subject of your art into the first or last of those thirds helps emphasize its importance and draws the eye. It also frames the remaining, more open two thirds as a point of interest to the subject, a place they might be looking or going. Link is the subject in this scenario, and the camera essentially enforces the rule of thirds while a Z-target is active. It’s not only very aesthetically pleasing, helping draw the player into the drama of a good sword fight, but also very functional. By ensuring Link and the target occupy opposite ends of the screen, it becomes very rare that Link himself will obscure his target from the player sitting on their couch. In this way, essential information conveyed by your target, like an incoming attack, isn’t accidentally hidden from the player. This diagonal framing also helps keep the spatial relationship between Link and his target clear and unambiguous, which, as I’ve mentioned elsewhere, is essential to satisfying combat.
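To make the framing idea concrete, here is a minimal top-down sketch of how a lock-on camera could push the player off-centre while keeping the target in the middle of the frame. This is not Ocarina of Time's actual camera code. The function names, the `back` and `side` offsets, and the 60 degree field of view are all assumptions made up for illustration.

```python
import math

def lock_on_camera(player, target, back=4.0, side=1.5):
    """Place a camera behind the player and shifted to the player's right.

    Positions are top-down (x, z) pairs. The offsets are hypothetical values.
    """
    px, pz = player
    tx, tz = target
    dx, dz = tx - px, tz - pz
    dist = math.hypot(dx, dz)
    if dist == 0:
        raise ValueError("player and target must be at different positions")
    fx, fz = dx / dist, dz / dist   # unit vector from player toward target
    rx, rz = fz, -fx                # unit vector to the player's right
    return (px - back * fx + side * rx, pz - back * fz + side * rz)

def screen_x(point, camera, look_at, fov_deg=60.0):
    """Horizontal screen position of a point: -1 is the left edge, +1 the right.

    Assumes the point is in front of the camera (positive depth).
    """
    cx, cz = camera
    lx, lz = look_at[0] - cx, look_at[1] - cz
    length = math.hypot(lx, lz)
    fx, fz = lx / length, lz / length   # camera forward
    rx, rz = fz, -fx                    # camera right
    relx, relz = point[0] - cx, point[1] - cz
    depth = relx * fx + relz * fz
    lateral = relx * rx + relz * rz
    return lateral / (depth * math.tan(math.radians(fov_deg) / 2))

player, target = (0.0, 0.0), (0.0, 6.0)
camera = lock_on_camera(player, target)
print(screen_x(target, camera, target))  # target sits at the centre of the frame
print(screen_x(player, camera, target))  # player lands left of centre
```

With these made-up offsets the player ends up roughly a third of the way in from the left edge, while the target stays centred. Flipping the sign of `side` would mirror the composition.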

This mechanic of making Link’s position relative to his target unambiguous is very strictly upheld. The camera will eagerly clip into walls to ensure the target remains properly framed, but this isn’t a problem, as obscuring geometry will often not be rendered, so the camera’s over-commitment to framing is actually an advantage. It’s very intuitive. In an interview with the game’s general director, Toru Osawa, it was said that the system was inspired by a ninja-and-samurai-themed performance. A ninja attacked with a sickle on a chain, which was caught by the samurai. The ninja moved in a circle around his opponent as the chain connecting them was pulled tight. It seems drawing an invisible and unbreakable line between two entities helped the developers visualize how this new system would work. Link will always circle around his target in-game, and inputs on the controller are reinterpreted during a Z-target so that they are relative to the target. Moving Link “Left” means he will move clockwise around his target. “Right” means he will move counterclockwise around his target. It is as if Link is moving on a 2D plane, but bent and wrapped around the target. This abstraction expands into a rather robust system.
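The "2D plane wrapped around the target" idea can be sketched as movement in polar coordinates centred on the target. This is only an illustration of the concept, not the game's real movement code. The function `orbit_step` and its `speed`, `dt`, and `min_radius` parameters are hypothetical.

```python
import math

def orbit_step(player, target, stick_x, stick_y,
               speed=4.0, dt=1 / 60, min_radius=0.5):
    """Advance the player one frame while locked on to a target.

    Positions are top-down (x, z) pairs seen from above.
    stick_x: -1 is left, +1 is right. stick_y: +1 pushes toward the target.
    """
    px, pz = player
    tx, tz = target
    radius = math.hypot(px - tx, pz - tz)
    if radius == 0:
        raise ValueError("player and target must be at different positions")
    angle = math.atan2(pz - tz, px - tx)
    # Left lowers the angle (clockwise from above), right raises it.
    angle += stick_x * speed * dt / radius
    # Forward and back only change the distance to the target.
    radius = max(min_radius, radius - stick_y * speed * dt)
    return (tx + radius * math.cos(angle), tz + radius * math.sin(angle))

# Holding "left" for one second keeps the distance fixed and circles clockwise.
pos = (0.0, -5.0)
for _ in range(60):
    pos = orbit_step(pos, (0.0, 0.0), stick_x=-1.0, stick_y=0.0)
print(pos, math.hypot(*pos))
```

Because sideways input only ever changes the angle, the distance to the target stays fixed while strafing, much like the taut chain in the performance described above.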

Child Link (The Legend of Zelda: Ocarina of Time) sidles up to a wall while a targeting crosshair focuses on a giant spider. The camera moved through the nearby wall, but the wall fades from view as this happens, allowing Link to see the spider's underside, which he shoots with his slingshot.
With the spider conveniently framed by the camera, even through this wall, Link is able to sneak a shot in to hit its vulnerable underside.

Another thing I noticed while playing Ocarina of Time recently is how movement during Z-targeting relates to input on the controller. I’ll give you an example. While a Z-target is active, Link can do a quick side-step or back flip to avoid enemies. Holding the control stick back, toward yourself, when you press the action button will initiate a back flip. Holding the control stick to the left or right will initiate a side-step when the action button is pressed. So it seems the game is tracking Link’s facing direction relative to the camera for the purposes of his evasive jumps. If Link is facing perpendicular to the camera, or in other words, if his shoulder line forms a right angle with the plane of the game screen, then a “right” or “left” input on the control stick is considered “back” for the purposes of evasion. You can see this illustrated below:

Child Link (The Legend of Zelda: Ocarina of Time) hops to the side repeatedly in a naturalistic wooden interior. A targeting crosshair focuses on a giant cyclopic bug. When Link is almost side-on to the camera, he does a back flip.
During this clip, I am holding only the “right” direction, but Link eventually back flips anyway.

In the clip above, I am holding “right” on the control stick throughout. Once Link’s angle to the camera becomes too extreme, he no longer side-steps, and instead back flips. However, Link’s stride never changes. “Right” on the control stick is always considered to be Link’s right, relative to his current standing position, for the purposes of calculating what direction Link should be running. I can imagine a couple of reasons this might be. Changing Link’s continuous move direction on a dime would be very disorienting for the player. Link’s stride is not really changing in the previous clip, only the player’s angle of observation, so it would be unintuitive to require a change in input to keep that stride going. The evasive jumps, however, are discrete units of movement and thus are not jarring when their operation changes based on camera position. Further, if Link were to side-step while side-on to the camera, it would be difficult to tell if he had done much of anything. By changing it to a back flip, the feedback of Link making an evasive move is maintained.
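One way to model the behaviour I observed is to read the stick in camera space, convert it to a world direction, and compare that against the way Link is facing. This is my guess at a plausible mechanism, not the game's confirmed logic. The function name, the dead zone, the 45 degree threshold, and the returned labels are all assumptions.

```python
import math

def evasive_move(stick, link_facing, camera_forward,
                 threshold_deg=45.0, dead_zone=0.5):
    """Pick an evasive move from a stick direction read in camera space.

    stick: (x, y), +x is screen-right, +y is away from the player.
    link_facing and camera_forward are top-down (x, z) direction vectors.
    """
    sx, sy = stick
    if math.hypot(sx, sy) < dead_zone:
        return "none"
    c_len = math.hypot(*camera_forward)
    cfx, cfz = camera_forward[0] / c_len, camera_forward[1] / c_len
    crx, crz = cfz, -cfx                      # camera right
    # World-space direction the stick is pointing.
    wx = sx * crx + sy * cfx
    wz = sx * crz + sy * cfz
    l_len = math.hypot(*link_facing)
    lfx, lfz = link_facing[0] / l_len, link_facing[1] / l_len
    dot = wx * lfx + wz * lfz
    cross = lfx * wz - lfz * wx               # positive means Link's left
    angle = math.degrees(math.atan2(cross, dot))
    if abs(angle) > 180.0 - threshold_deg:
        return "back_flip"
    if abs(angle) < threshold_deg:
        return "forward"                      # not covered in this post
    return "side_step_left" if angle > 0 else "side_step_right"

# Camera directly behind Link: "right" is a side-step.
print(evasive_move((1, 0), link_facing=(0, 1), camera_forward=(0, 1)))
# Link side-on, facing screen-left: the same "right" input now points behind him.
print(evasive_move((1, 0), link_facing=(-1, 0), camera_forward=(0, 1)))
```

Under this model the input itself never changes meaning. What changes is how it lines up with Link's body, which would explain why holding “right” eventually produces a back flip as he circles his target.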

The Legend of Zelda: Ocarina of Time, being the first 3D Zelda game, obviously utilizes its verticality in ways that previous Zelda games could not. Zelda is a series well known for an arsenal of unique weapons and tools for solving puzzles and dispatching enemies. Iconic tools like the boomerang and hero’s bow are very compelling. It would have been a drastic admission of defeat to not translate such things into the first 3D Zelda. They have some hefty inherent problems, though. Control sticks are, frankly, not best suited for precision pinpoint aiming compared to a computer mouse, a gyroscope, or a photonic motion sensor. Ocarina of Time still offers the option of manually aiming projectiles through a first-person perspective, which is convenient for solving puzzles, but not ideal for most combat encounters. The Z-Targeting system rather elegantly solved this problem as well: while a target is locked, projectiles fly toward it automatically. The drawback is that the player doesn’t do much aiming at all when utilizing their bow and arrow in combat, and aiming could be argued to be part of the skill set of playing the old Zelda games, but in trade Ocarina gets the advantage of keeping airborne enemies in focus and keeping the use of projectiles in combat practical. Zelda combat is typically more about understanding the best tool for the job than skillful execution anyway, so I think it was a savvy decision to enable ranged combat in this way.
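As a rough sketch of that trade-off, a projectile launcher could fall back to the player's manual aim only when nothing is locked on. The function name, the `speed` value, and the vector layout below are hypothetical and chosen purely for illustration. The real game's projectile logic is not something I have seen.

```python
import math

def projectile_velocity(origin, manual_aim, locked_target=None, speed=30.0):
    """Return an (x, y, z) launch velocity for a projectile.

    If a target is locked, aim straight at it, height included.
    Otherwise use the manually aimed direction.
    """
    if locked_target is not None:
        direction = tuple(t - o for t, o in zip(locked_target, origin))
    else:
        direction = manual_aim
    length = math.sqrt(sum(c * c for c in direction))
    if length == 0:
        raise ValueError("aim direction must be non-zero")
    return tuple(speed * c / length for c in direction)

# An enemy clinging to the ceiling, above and ahead of the player.
print(projectile_velocity((0.0, 1.0, 0.0), manual_aim=(0.0, 0.0, 1.0),
                          locked_target=(0.0, 4.0, 4.0)))
# -> (0.0, 18.0, 24.0)
```

Note that the manual aim is ignored entirely whenever a lock is present, which is exactly the loss of aiming skill described above, exchanged for shots that reliably reach airborne enemies.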

Child Link (The Legend of Zelda: Ocarina of Time) shoots a strange giant egg off of a ceiling, in a naturalistic wooden interior, using his slingshot. He then stabs a nearby giant bug with his sword, then shoots it as it runs away.
There was an intent focus in this game on making your tools practical and functional, even if they’re not always the most complex or involved.

So many modern games utilize an automated camera or targeting system that can be traced directly back to Z-targeting, so I felt it deserved its own appreciation post here. The mechanic is unintrusive, fit-for-purpose, artistically sound, and practically seamless. It even has its own little diegetic explanation: your partner fairy, Navi, acts as the source of your target’s focus. You might notice her dancing around targeted enemies in the clips I’ve provided. It helps reinforce her as an important partner to Link, even in spite of her infamous chattiness. Honestly, after looking into it, there are some features that even some modern targeting systems don’t do as well as Ocarina of Time. There have been perhaps more elegant, more robust, and even more interesting targeting systems since, but it’s absolutely astounding how much Zelda nailed it on its first try, and set the stage for the iteration of 3D navigation for many years to come.

Child Link (The Legend of Zelda: Ocarina of Time) pursues an elephant-sized cyclopic insect as it climbs up a wall in a dark cavern. Link aims up at it with a targeting crosshair focused on it, then shoots its eye with his slingshot.

Time passes, people move. Like a river’s flow, it never ends…