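[Bilingual translation] From REST to GraphQL
Original article:
From REST to GraphQL jacobwgillespie.com
Translator's note: This article was published on October 9, 2015. It is an early piece, and the shortcomings it describes have since been addressed, but a look back at history also shows how GraphQL has developed, hence this translation.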
Disclaimer: GraphQL is still new and best practices are still emerging. This post describes some of my journey with implementing a GraphQL backend service, so it is a snapshot of what I've learned so far, presented in the hopes that it will be useful to others. Also, some of the specific real-world implementation details internal to Playlist have been paraphrased / simplified / anonymized for obvious reasons.
宣告:GraphQL剛剛初出茅廬,它的最佳實踐正如雨後春筍般湧現。本文主要描述了我在實現GraphQL後端服務的一些經驗,順便記錄一下我迄今為止所學到的內容,以饗大家。當然了,由於眾所周知的原因,文章中播放列表(Playlist)的細節做了一些脫密處理。
This post assumes a basic familiarity with GraphQL. If you are not already familiar with GraphQL:
本文假設您對GraphQL有基本的瞭解。如果您還不熟悉的話,請點選
REST
At Playlist, we have a Rails / REST-based API that powers the app. When it was created initially, we used Github's V3 API as an inspiration and generally modeled our API structure after theirs.
在我們的 App 播放列表(Playlist)頁面,我們構建了一個基於 Rail/REST 的介面。在最初建立它的時候,我們參考了 Github 的V3 API來構建自己的API架構。
Need track information?
要檢視歌曲(track)的資訊?
GET /tracks/ID
Need to fetch a playlist?
獲取播放列表?
GET /playlists/ID
Need a playlist's tracks?
檢視播放列表的歌曲(track)?
GET /playlists/ID/tracks
It had the benefit of simplicity - endpoints are intuitively named and can be browsed easily. Initially we even implemented URL properties on all objects so the API was browsable just by clicking (this was eventually removed in favor of smaller response payloads). Documentation described what was returned by each endpoint so our mobile team could easily integrate.
介面名稱顯而易見,瀏覽毫無壓力。最開始我們甚至在所有物件上都部署了URL屬性,只要點選就能瀏覽 API (當然後來我們為了減少負載最終還是移除了這個功能)。說明文件把每個介面的返回內容都清晰地列了出來,這樣我們的團隊整合起來也是輕鬆愉快。
Bloat and Slowdowns
However, as time passed, payloads got larger as requirements grew. As an example, here is a simplistic playlist object response:
好景不長,隨著時間的推移和需求的增長,發回來的資料越來越多,下面展示了一個簡單的播放列表介面的返回資料。
    {
      "created_at": "2015-08-30T00:50:25.000+00:00",
      "id": "e66637db-13f9-4056-abef-f731f8b1a3c7",
      "like_count": 3,
      "liked_count": 3,
      "name": "Excuse me while I kiss these frets",
      "owner": {
        "avatar_url": "https://secure.gravatar.com/avatar/4ede0ad35bb796ea8f78861acc4372ca?s=300",
        "bio": null,
        "id": "b06e671a-b169-45e6-a645-74c31abca910",
        "login": "playlistrock",
        "name": "Playlist Rock",
        "site_admin": false
      },
      "published": false,
      "saved_count": 3,
      "track_count": 50,
      "updated_at": "2015-09-30T06:11:49.000+00:00"
    }
It contains all the basic information about the playlist, but (almost) none of the associated objects. As a client, you would be expected to call other endpoints like `/playlists/ID/tracks` to fetch sub-resources.
它涵蓋了播放列表(Playlist)的所有基本資訊,但(幾乎)沒有關聯物件。在客戶端方面需要調取其他介面(如`/playlist/ID/tracks`)來獲取進一步的資源。
As more associations were added, more data kept getting stuffed into the playlist response. Specifically, because we used Rails and ActionView partials, more data was added to the `_playlist.json.jbuilder` partial as lists of playlists needed more and more data.
隨著資料間的關聯越來越複雜,播放列表(Playlist)介面的返回資料也越來越多。具體來說,因為我們使用了 Rails 和 ActionView 部分,所以成噸的資料被放到了 _playlist.json.jbuilder
中處理,以應對播放列表(Playlist)需要越來越多資料的麻煩。
Mobile requirements would state something like "we need to show the first three tags for each user playlist when displaying a user's profile," so rather than call `/users/USERNAME/playlists`, then have to make an HTTP request to `/playlists/ID/tags` once for each returned playlist, the tags got added to the playlist partial.
但是這還沒完,移動端的需求是"我們需要在使用者的個人資料頁中顯示每個使用者播放列表(Playlist)的前三個標籤",這樣一來,我們就不能呼叫 /users/USERNAME/playlists
了。 相反,我們要呼叫 /playlist/ID/tags
對每個新增到播放列表(Playlist)的標籤欄位(tags)再各發送一次HTTP請求去獲取播放列表(Playlist)。
    {
      "created_at": "2015-08-30T00:50:25.000+00:00",
      "genres": [],
      "id": "e66637db-13f9-4056-abef-f731f8b1a3c7",
      "like_count": 3,
      "liked_count": 3,
      "name": "Excuse me while I kiss these frets",
      "owner": {
        "avatar_url": "https://secure.gravatar.com/avatar/4ede0ad35bb796ea8f78861acc4372ca?s=300",
        "bio": null,
        "id": "b06e671a-b169-45e6-a645-74c31abca910",
        "login": "playlistrock",
        "name": "Playlist Rock",
        "site_admin": false
      },
      "published": false,
      "saved_count": 3,
      "tags": [
        { "name": "Jimi Hendrix" },
        { "name": "Jimmy Page" },
        { "name": "Eric Clapton" },
        { "name": "Slash" },
        { "name": "Stevie Ray Vaughan" }
      ],
      "track_count": 50,
      "updated_at": "2015-09-30T06:11:49.000+00:00"
    }
Eventually, we got to something like the following for a `/playlists/ID` response:
最終,我們從 /playlists/ID
介面獲取的返回值就像下面這樣:
    {
      "collaborators": [],
      "created_at": "2015-08-30T00:50:25.000+00:00",
      "genres": [],
      "id": "e66637db-13f9-4056-abef-f731f8b1a3c7",
      "like_count": 3,
      "liked": true,
      "liked_count": 3,
      "name": "Excuse me while I kiss these frets",
      "owner": {
        "avatar_url": "https://secure.gravatar.com/avatar/4ede0ad35bb796ea8f78861acc4372ca?s=300",
        "bio": null,
        "id": "b06e671a-b169-45e6-a645-74c31abca910",
        "login": "playlistrock",
        "name": "Playlist Rock",
        "site_admin": false
      },
      "published": false,
      "saved": true,
      "saved_count": 3,
      "tags": [
        { "name": "Jimi Hendrix" },
        { "name": "Jimmy Page" },
        { "name": "Eric Clapton" },
        { "name": "Slash" },
        { "name": "Stevie Ray Vaughan" }
      ],
      "track_count": 50,
      "tracks": [
        {
          "album": {
            "id": "8d8223c6-284c-4aac-92bd-b31debca3237",
            "title": "Toys In The Attic"
          },
          "artists": [
            { "id": "6c29ff27-ad20-4448-9961-f6617e393539", "name": "Aerosmith" }
          ],
          "explicit": false,
          "have_liked": false,
          "id": "a1f9f37a-2a15-407d-82f8-e742ab5e3b81",
          "title": "Walk This Way"
        },
        {
          "album": {
            "id": "21a9f63b-a38f-40f1-aaf1-8b7ed3ad1a92",
            "title": "Audioslave"
          },
          "artists": [
            { "id": "7d600588-d073-41e9-a4f7-434501b16c45", "name": "Audioslave" }
          ],
          "explicit": false,
          "have_liked": false,
          "id": "4cc1fc43-61e8-49a7-be42-9d7ad35c1284",
          "title": "Like A Stone"
        }
      ],
      "updated_at": "2015-09-30T06:11:49.000+00:00"
    }
Here we're embedding tracks and even a subset of their associations, with enough data to cover all the possible places an individual playlist could appear. And this data was returned for every place the playlist appeared.
這裡,我們把進一步獲取的資訊嵌入到了之前的資料甚至是其相關聯的子資料當中,並提供足夠的資料來覆蓋單個播放列表(Playlist)中所有可能的位置,再把資料返回到每個播放列表出現的地方。
This was a conscious design decision to augment responses rather than add more endpoints - we could have done something like `/playlists/ID/forProfile`, `/playlists/ID/forNotifications`, etc.
我們選擇增加返回的資料項而非新增更多埠,這是在設計時有意為之的 —— 誠然,我們確實可以再寫幾個諸如 /playlists/ID/forProfile
、 /playlists/ID/forNotifications
之類的介面。

There is something to be said for the simplicity that provides. To add a field to a track, for example, you locate the `_track.json.jbuilder` partial and add the additional field. However, as views grew, performance quickly became an issue in two distinct ways.
其實實現起來也並不困難。要給歌曲(track)資料加個欄位,找到 _track.json.jbuilder
部分添上就行了。但是吧,隨著頁面越來越多,頁面效能問題很快就會以兩種不同的方式給你喂屎。
First, response payloads were large, to the point that the mobile app sometimes struggled with the amount of effort it took to parse, deserialize, and store the JSON. Response times were longer, caches were larger, and every change to a small partial expanded to a much larger change all over the app.
首先 ,返回的資料量很大,以至於移動應用會在解析、反序列化和儲存JSON上花費太多時間。返回資料的時間更長了,快取也更大了,這樣一來每個小的更改都產生了“牽一髮而動全身”的效應。
Second, query performance took a hit as more and more data (especially relationships) was fetched for each request. In development with caching disabled, a single request for a playlist can trigger upwards of 170 database queries to pull all the relevant information.
其次 , 由於每個請求所取回的資料越來越多(特別是關係型資料),效能也受到了衝擊。在不使用快取的開發環境中,對一個播放列表(Playlist)的單次請求甚至包括了170個數據庫查詢欄位來拉取相關資訊。
In production, we made heavy use of Rails "Russian Doll" style caching, so for a fully cached playlist there is only one database query involved. Still, on that first load it had to execute those 170 queries to build the full response (usually fewer thanks to Russian doll caching and shared sub-resources).
在生產環境下,我們大量的使用了Rails的“套娃”式快取,因此對於完全快取的播放列表(Playlist),使用一個數據庫查詢欄位足矣。不過,第一次載入的時候,它仍要執行這170個查詢來構成完整的響應資料(應該說,多虧了套娃快取和共享副資源,我們所請求的資料通常沒那麼多)。
Russian doll caching: edgeguides.rubyonrails.org
What pushed us over the edge was the `have_liked` field above. This was a boolean field indicating whether or not the currently authenticated user had liked the track. Product requirements stated that this field had to be accessible on the playlist detail view, and thus had to be included in the playlist response for each track.
促使我們跨越障壁的是上面的 have_liked
欄位。這是一個布林值欄位,表示的是當前已通過身份驗證的使用者是否喜歡該歌曲(track)。我們可愛的產品經理要求必須在播放列表(Playlist)的詳細檢視中訪問該欄位,因此它就只能包含在每個歌曲的播放列表(Playlist)返回資料當中。
This broke the Russian doll caching.
這一下就把“套娃”給摔得稀碎。
The `_track.json.jbuilder` partial became a combination of a cached portion containing "static" information about the tracks and an uncached portion containing the call to `current_user.have_liked?(track)`. Subsequently, `_playlist.json.jbuilder` and every view that referenced the track partial transformed similarly to contain a cached portion and an uncached portion.
_track.json.jbuilder
成為了包含歌曲的“靜態”資訊和呼叫 current_user.have_liked?(track)
非快取部分資訊的集合體。隨之而來的是, _playlist.json.jbuilder
和每個引用了該歌曲部分的檢視都包含了快取和非快取兩個部分。
Worse still, for a playlist request with 50 tracks, 50 calls to `have_liked?` were executed (N+1 query bug).
更為操蛋的是,如果一個播放列表(Playlist)中有50首歌,那就要呼叫50次 have_liked?
介面,會產生N + 1 查詢Bug。
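To make the N+1 pattern concrete, here is a minimal sketch. The in-memory "database" and its `hasLiked` / `likedTrackIds` methods are hypothetical stand-ins, not the real Playlist backend; the only point is to count how many queries each approach issues.

```javascript
// Hypothetical in-memory "database" that counts how many queries it receives.
function createFakeDb(likes) {
  const db = {
    queryCount: 0,
    // One query per (user, track) pair: the N+1 pattern.
    hasLiked(userId, trackId) {
      db.queryCount += 1
      return likes.some(l => l.userId === userId && l.trackId === trackId)
    },
    // One query for a whole list of tracks.
    likedTrackIds(userId, trackIds) {
      db.queryCount += 1
      return new Set(
        likes
          .filter(l => l.userId === userId && trackIds.includes(l.trackId))
          .map(l => l.trackId),
      )
    },
  }
  return db
}

const trackIds = Array.from({length: 50}, (_, i) => `track-${i}`)
const likes = [{userId: 'u1', trackId: 'track-3'}]

// Naive: one query per track, so 50 tracks mean 50 queries.
const naiveDb = createFakeDb(likes)
const naive = trackIds.map(id => naiveDb.hasLiked('u1', id))

// Batched: a single query covers all 50 tracks.
const batchedDb = createFakeDb(likes)
const liked = batchedDb.likedTrackIds('u1', trackIds)
const batched = trackIds.map(id => liked.has(id))
```

Both approaches produce the same list of booleans; only the number of round trips differs.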
We had several different possible solutions, including separate sub-resource view files for separate endpoints, custom query cache management to reduce the number of additional queries, etc. However, we wanted a solution that addressed both issues and allowed for greater control.
我們有多種不同的解決方案,包括單獨的副資源檢視檔案、用於減少其他查詢數量的自定義查詢快取管理等等。但是,我們需要一種能夠解決這兩個問題並且更容易控制的解決方案。
GraphQL
Enter GraphQL. Using GraphQL to power our backend, we were able to provide the mobile client exactly what it needed for each request, with no additional bloat, and were able to optimize the database and cache layer to do everything in an extremely performant way.
於是我們就嘗試著使用了GraphQL,用它來幫助我們的後端,以此便可以精確地為移動端提供其所需的資料,沒有任何冗餘。此外還可對資料庫和快取層進行優化,從而以極其高效的方式完成所有工作。
Before getting into some of the specific details, here are a few common questions / misconceptions I often encountered or experienced myself while learning about GraphQL:
在深入瞭解一些具體細節之前,先行列出一些我在學習GraphQL時經常遇到或者經歷過的一些常見問題/誤解:
Common Questions / Misconceptions
GraphQL sounds like graph. Does my data need to be a "graph" or do I need a "graph" database? Does it work with relational databases?
GraphQL聽起來像圖(graph)啊。那我的資料還得是個圖是麼?或者說我需要整一個“圖”資料庫?這個適用於關係型資料庫嗎?
No, you do not need a graph database, it works just fine with whatever database you have.
別介,不勞您費事,它適用於您所擁有的任何資料庫。
While IMO you can think about almost any "relational" database in terms of a "graph" - something like:
雖然我的建議(IMO,in my opinion)是您可以用“圖形”來考慮幾乎所有的“關係型”資料庫 —— 就像下面這樣:
    user --- OWNS --- playlist
     |                    |
    LIKES              CONTAINS
     |                    |
     v                    |
    track <---------------┘
GraphQL describes and fetches data like a tree:
GraphQL以類似樹形圖的形式描述並取回資料:
    user
     ┖-OWNS-> playlist
               ┖-CONTAINS-> track
                             ┖-LIKED_BY-> users
You can use a graph database, a relational database, an in-memory array, a key-value store, whatever. At Playlist, we use Neo4j as a "primary" database, operating in full graph mode, and Redis, acting as a cache layer utilizing various different data structures including hashes, key-value pairs, and sets. Redis essentially represents the data in Neo as key-value stores by ID and ZSETs for associations by type, closely mirroring Facebook's TAO model:
您可以使用圖形資料庫、關係型資料庫、記憶體陣列、鍵值儲存等等。在播放列表(Playlist)頁中,我們選擇了Neo4j作為“主”資料庫,並以完全圖形模式執行。同時,我們用Redis作為快取層並使用了包括雜湊、鍵值對和集合在內的資料結構。Redis本質上是將Neo中的資料按型別關聯,通過ID和ZSET的鍵值對形式儲存,這和Facebook的TAO模型有異曲同工之妙。
TAO model: www.facebook.com
This allows us to have an authoritative data source in Neo with the full power of Cypher queries but the performance of an in-memory key-value store for 90% of all queries.
這就讓我們能夠獲得Neo中的權威資料,它具有Cypher查詢的全部功能。當然實際上Redis的效能已經能夠滿足90%的查詢情況了。
GraphQL sounds like "query language" which sounds like I'm exposing the ability to query my database on the client. This sounds dangerous. What about malicious clients?
GraphQL聽起來像“查詢語言”啊,那是不把查詢資料的功能是暴露在客戶端上了啊。有點方啊,要是有人搞事情可怎麼辦?
No, you're not exposing your database queries to the client any more than you were with your REST API. Okay, maybe a bit.
不能夠。資料庫查詢暴露給客戶端是不存在的,跟REST API是一樣的。彳亍口巴,有一點兒。
GraphQL is more or less a DSL on top of your own backend data fetching logic. It does not connect directly to a database. In fact, the schema you expose over GraphQL will likely not mirror your database exactly. It provides a way to describe a request for structured data, but it is then up to your backend to fulfill that request.
GraphQL或多或少是基於您自己的後端資料獲取邏輯的DSL。它並不直接連線到資料庫。實際上,通過GraphQL公開的模式可能並不會完全地映象您的資料庫。它提供的是一種描述結構化資料請求的方法,實際上完成請求的還是後端。
One concern is that GraphQL supports "nested" fetching, so should a malicious client request a particular recursive nested relationship an arbitrary but large number of times (like user.followers.followers...), there could be a potential performance hit on the backend. See the final section for a few ideas on how to mitigate this risk.
有一個問題是GraphQL是支援取回“巢狀”資料的,因此如果有別有用心之人大量且遞迴地請求特定的巢狀資料(如 user.followers.followers
),那確實會對後端有潛在的效能影響。最後一節會介紹一些如何降低此風險的方法。
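One possible mitigation, sketched here as an assumption rather than as what the author actually deployed, is to reject queries that nest too deeply before they reach the executor. The helper names `maxQueryDepth` and `assertDepthAllowed` are hypothetical, and the brace counting is deliberately naive (it does not handle braces inside string literals).

```javascript
// Returns the deepest level of `{ ... }` nesting in a query string.
// Simplification: braces inside string literals are not handled.
function maxQueryDepth(query) {
  let depth = 0
  let max = 0
  for (const ch of query) {
    if (ch === '{') {
      depth += 1
      max = Math.max(max, depth)
    } else if (ch === '}') {
      depth -= 1
    }
  }
  return max
}

// Hypothetical guard to run before handing the query to the GraphQL executor.
function assertDepthAllowed(query, limit) {
  const depth = maxQueryDepth(query)
  if (depth > limit) {
    throw new Error(`Query depth ${depth} exceeds limit ${limit}`)
  }
  return depth
}
```

A production version would more likely walk the parsed query AST instead of the raw string, but the idea is the same: put a ceiling on recursion such as `user.followers.followers...`.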
So, GraphQL doesn't provide unauthenticated access to my database?
那麼問題來了,GraphQL難道可以不經驗證就訪問我的資料庫麼?
No. Authentication is most likely handled outside of GraphQL entirely and your backend is still responsible for handling data fetching / authorization in a secure way, just like how you were doing before with REST.
不存在的。身份證驗證最好是在GraphQL之外進行處理的,您的後端仍然負責以安全的方式處理資料/授權,和用REST別無二致。
For our new GraphQL backend, we perform authentication outside of GraphQL entirely, passing it as a request header and having the server authenticate the request and then pass the authentication context down to the GraphQL data resolvers.
對於我們新的GraphQL後端,我們完全在GraphQL之外執行身份驗證,把驗證資訊作為請求頭傳遞,讓伺服器對請求進行身份驗證,然後將身份驗證的上下文傳遞給GraphQL資料解析器。
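As a rough sketch of that flow: the server reads a header, authenticates, and builds the value that is later passed to GraphQL as `rootValue`. The `ctx.auth` shape follows the resolver code later in this post; the `Bearer` header scheme, the `usersByToken` store, and the `buildRootValue` helper are assumptions made for illustration.

```javascript
// Hypothetical token store; a real app would look up a session or user record.
const usersByToken = new Map([['secret-token', {id: 'u1', login: 'playlistrock'}]])

// Builds the value handed to GraphQL as rootValue, based only on request headers.
function buildRootValue(headers) {
  const header = headers.authorization || ''
  const token = header.startsWith('Bearer ') ? header.slice('Bearer '.length) : null
  const user = token ? usersByToken.get(token) || null : null
  return {ctx: {auth: {isAuthenticated: user !== null, user}}}
}
```

Resolvers then only ever see `auth.isAuthenticated` and `auth.user`; they never parse headers themselves.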
At Playlist, we would eventually like to make our GraphQL backend "transport-agnostic", so we could grab data over HTTP like normal or request data via a non-HTTP wire protocol. It would even be cool to implement some kind of live streaming updates for real-time data changes over something like MQTT. As such, we've considered embedding authentication information, either authentication tokens or username/password pairs, in the GraphQL requests themselves, but as of yet we have not fully explored those paths.
在播放列表頁面,我們最終希望使我們的GraphQL後端“和傳輸無關”,因此我們可以像往常一樣通過HTTP或其他協議獲取資料。我們如果實現了類似MQTT(Message Queuing Telemetry Transport,訊息佇列遙測傳輸協議)的實時流式資料更新,那可就太酷了。因此,我們已經考慮在GraphQL請求中加入token或使用者名稱/密碼對的授權資訊,不過到目前為止還處於計劃階段。
What about security?
那麼安全性怎麼樣呢
Again, this is completely up to your backend and it is not a primary concern of GraphQL. We will see an authenticated resolver below (the function that fetches and returns data). There seem to be two predominant approaches to handling a client attempting to access something they are not authorized to view.
一樣的,這個完全取決於您的後端,它並非GraphQL的主要關注點。下文我們會演示一個待驗證的解析器(獲取和返回資料的函式)。看起來主要有兩種方法來應對試圖訪問無權檢視內容的客戶端。
First, return null for the requested field. This seems to work well in cases where there is no real harm in asking for a particular set of data and no real harm in denying it.
首先 ,我們可以為請求的欄位返回null。這可以用於無害請求一組特定資料或請求資料被拒絕的情況。
A good example would be asking for the email of a user where the backend only provides the user's email to that user themselves. If I request my own user object with the email field included, I'll get my email. If I request another user object, the email will be null, and I can code my application to be okay with that null.
一個很好的例子是要求使用者提供提供電子郵箱,這樣後端只需將使用者提供的郵箱發給他即可。如果我請求的使用者資料物件中帶有郵箱欄位,那麼就會收到郵件。如果我請求的是另一個使用者的資料,那麼郵箱欄位的值就是 null
,這樣在應用裡也很好處理了。
Second, return an actual error. This seems to work best if the client asking for the data needs to know why it was not provided the requested data so that it can take action on that information.
第二 ,返回一個報錯。如果請求資料的客戶端需要知道為什麼沒有提供所請求的資料,從而根據原因來採取對應的行動,那麼這麼做最有效。
A good example would be attempting to access an object that requires authentication, but no authentication was provided.
舉例來說,使用者嘗試訪問一個需經過身份驗證的資料,但卻並未經過驗證,這時就可以嘗試返回錯誤。
"404s" are usually returned as nulls. As per convention (like on Github's API), unauthorized objects are sometimes returned as null as well, like in the case of asking for a user's profile when that user has blocked the currently authenticated user. A null mimics a 404 and does not leak the fact that the hidden user exists.
大家耳熟能詳的返回值就是“404”了。按照慣例(如Github API),未授權的物件有時也會返回null。這就和使用者去請求一個已經被遮蔽的已授權使用者資料是一樣的。null模仿了404並且沒有洩露隱藏使用者存在的事實。
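The two approaches can be sketched as plain resolver functions. The names `resolveEmail` and `resolveViewer` are hypothetical, and `auth` is assumed to have the `isAuthenticated` / `user` shape used elsewhere in this post.

```javascript
// Approach 1: return null when the viewer may not see the field.
function resolveEmail(user, auth) {
  const isSelf = auth.isAuthenticated && auth.user.id === user.id
  return isSelf ? user.email : null
}

// Approach 2: throw so the client learns why the data is missing.
function resolveViewer(auth) {
  if (!auth.isAuthenticated) {
    throw new Error('Authentication required')
  }
  return auth.user
}
```

With graphql-js, an error thrown from a resolver surfaces in the `errors` array of the response while the field itself comes back as null, so the client can tell the two cases apart.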
The Github repositories are confusing! Which one is really GraphQL?
Github倉庫裡成噸的GraphQL庫!哪個是真的啊?嚶嚶嚶
graphql/graphql-spec is the specification for the GraphQL language and its implementation - it is not tied to any specific language / backend. It's great to read to fully understand the language, especially if you're into those things or learn best by digging into concepts and theories.
graphql/graphql-spec 是官方的GraphQL語言及其實現規範——它並不依賴於任何特定的語言/後端。如果您能夠通過深入研究概念和理論來學習這些那就再好不過了。
graphql/graphql-js is a reference implementation of that specification provided by Facebook, written in JS/Node. This is the place to start if you'd like to use GraphQL with a Node-based backend or just want to play around. To the best of my knowledge, this is the most complete implementation of the specification, being more or less the official reference implementation. Read the README.
graphql/graphql-js 是由Facebook提供的該規範的參考實現,採用JS/Node進行編寫。如果您想將GraphQL和基於Node的後端搭配起來,或是僅僅玩玩看看的話,那麼可以從它開始。據我所知,這是對規範最完整的實現,或多或少是官方的參考實現。瀏覽README可熟知一切。
graphql/express-graphql is a middleware for Express.js to easily create a GraphQL server with Express. I'd highly recommend reading the entire source code as it's not terribly long, is quite easy to understand, and lends itself to explaining how to use graphql-js, even if you don't end up using express-graphql directly.
graphql/express-graphql 是Express的中介軟體,可以使用Express來輕鬆地建立GraphQL伺服器。我強烈地建議您閱讀它的原始碼,因為它並不是非常長,很容易理解。即使您最終沒有直接使用它,您也可以熟知如何使用graphql-js。
graphql/graphql-relay-js is a set of helpers to implement Relay-compatible IDs and "connections" (one-to-many associations, or array fields). It is not required to use GraphQL; however, we have found that being Relay-compatible has benefited us with ID handling, pagination, etc., even though we're not using Relay. For more information on the Relay GraphQL specification, see the Relay docs.
graphql/graphql-relay-js 是用於實現與Relay相容的ID和“連線”(一對多關聯或陣列欄位)的助手。它並不需要使用GraphQL,但我們在使用中發現即使我們沒有使用Relay,在ID處理和分頁等情況下,Relay相容結構也讓我們很受用。更多資訊請訪問官方文件.
graphql/graphiql is a web-based IDE for GraphQL. This thing is freaking awesome. GraphQL provides schema introspection, and GraphiQL provides autocomplete and syntax validation using those introspection capabilities. You can download this project directly, embed it in your app, or, my favorite, download it as a standalone app in an Electron-based wrapper at skevy/graphiql-app.
graphql/graphiql 是GraphQL的Web IDE。驚了。GraphQL提供了schema自檢功能,依靠它可以實現自動完成和語法驗證。您可以直接下載該專案,將其嵌入您的應用當中。您也可以將其以獨立應用的形式應用到基於Electron的( skevy/graphiql-app )中
graphql/dataloader is a utility module that has revolutionized data fetching in our Playlist backend. Its foundation is extremely simple - it collects the arguments of calls to load() while in the current frame of execution (an event loop tick) and then uses your custom provided logic to batch-fetch data based on the collected arguments. More on how we use DataLoader below.
graphql/dataloader 是一個工具模組,它徹底底改變了我們從播放列表(Playlist)後端介面取回資料的方式。它把當前執行架構(事件迴圈tick)中的所有呼叫引數都集中到了`load()`方法內,然後根據您的邏輯來批量獲取資料。下文會有進一步的介紹。
graphql/swapi-graphql is an example project exposing the existing SWAPI as a GraphQL server. It utilizes graphql-js, express-graphql, GraphiQL, and DataLoader.
graphql/swapi-graphql 是將現有的SWAPI重構為GraphQL伺服器的示例專案。它使用了graphql-js、express-graphql、GraphiQL和DataLoader。
chentsulin/awesome-graphql is an awesome collection of links to GraphQL resources, projects, posts, and more. Check it out!
chentsulin/awesome-graphql 是一個很棒的GraphQL資源、專案、帖子等的連結集合。
What is Relay? Do I need Relay too?
Relay到底是個啥?我也需要這個?
facebook/relay is a framework for connecting GraphQL and React in an intelligent way. You absolutely do not need Relay to take advantage of GraphQL, though if you're using React, check it out - it may be useful in your app.
Relay是一個以智慧方式連線GraphQL和React的框架。其實你並不需要Relay就可以使用GraphQL,但如果你用React,那就來看看它吧——它會助你一臂之力。
Relay requires a few special conventions in your GraphQL query design to support its operation, and at Playlist we've decided to be Relay-compatible, even though we do not use Relay itself. This has provided a consistent API for fetching by ID and representing and paginating collections of associations. The Relay documentation has more information.
Relay需要你在GraphQL查詢設計當中為它做一些特殊約定。因此即便我們不使用Relay本身,我們也決定把Playlist介面改成Relay相容形。
Is GraphQL only for React?
不是吧,GraphQL僅僅支援React?
Nope. You can use it anywhere you previously used HTTP/REST.
否認三連.jpg。您可以在以前使用 HTTP/REST 的任何地方使用它。
Playlists and Tracks in GraphQL
我們來看看用GraphQL重寫的Playlist和Tracks介面
Let's delve into how we can solve the performance issues from the above playlist endpoint with GraphQL. We want to only return the data that is needed, and optimize our database queries so that we can avoid the N+1 bug.
讓我們深入研究一下如何使用GraphQL解決上述Playlist介面的效能問題。我們 只返回所需的資料 ,同時優化我們的資料庫查詢,從而使我們可以避免N + 1錯誤。
Our GraphQL query will look like the following:
我們的GraphQL查詢操作如下:
    query FetchPlaylist {
      playlist(id: "e66637db-13f9-4056-abef-f731f8b1a3c7") {
        id
        name
        tracks {
          id
          title
          viewerHasLiked
        }
      }
    }
Which then returns exactly the data requested, in the structure defined by the GraphQL query:
然後,在GraphQL查詢定義的結構中返回所請求的資料:
    {
      "playlist": {
        "id": "e66637db-13f9-4056-abef-f731f8b1a3c7",
        "name": "Excuse me while I kiss these frets",
        "tracks": [
          {
            "id": "a1f9f37a-2a15-407d-82f8-e742ab5e3b81",
            "title": "Walk This Way",
            "viewerHasLiked": true
          },
          {
            "id": "4cc1fc43-61e8-49a7-be42-9d7ad35c1284",
            "title": "Like A Stone",
            "viewerHasLiked": false
          }
        ]
      }
    }
For simplicity, the playlist ID was embedded in the query, though in practice we'd be passing the ID as a typed parameter rather than embedding it inside the query. See the GraphQL docs for more info.
為了簡單起見,我們把播放列表ID嵌入到查詢當中,但實際上我們將ID作為型別引數傳遞,而非嵌入查詢中。有關GraphQL的INPUT型別,請參閱GraphQL文件。
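For illustration, the same query with the ID passed as a typed variable might look like the sketch below. The `String!` type mirrors the non-null string argument defined on the root query further down; the `buildRequestBody` helper is hypothetical, and the `{query, variables}` body is the JSON shape commonly sent to HTTP GraphQL servers such as express-graphql.

```javascript
// The query declares a typed variable instead of embedding the ID.
const query = `
  query FetchPlaylist($id: String!) {
    playlist(id: $id) {
      id
      name
      tracks { id title viewerHasLiked }
    }
  }
`

// Hypothetical helper that builds the JSON body for an HTTP GraphQL request.
function buildRequestBody(id) {
  return JSON.stringify({query, variables: {id}})
}
```

The query text stays constant across requests, and only the `variables` object changes.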
We assume that authentication has taken place outside of GraphQL and the authentication state has been provided in the rootValue object of the GraphQL call so that our resolvers can access it. See the docs for graphql-js and express-graphql for more information about rootValue, and see below for it in action.
我們假設授權驗證是在GraphQL之外進行的,並且身份驗證狀態已經在GraphQL呼叫的 rootValue
物件中提供,以便我們的解析器(resolver)可以訪問。有關 rootValue
的更多資訊,請參閱graphql-js和express-graphql的文件,以及下面的操作。
First, we have to define a root query object, which is the entry point for the query. The root query object should have a field called `playlist`, since that's what we're requesting in the query above:
首先,我們 必須 定義一個根查詢物件,它是查詢的入口點。根查詢物件應該有一個名為`playlist`的欄位,因為這是我們在上面的查詢中提供的內容:
    import {GraphQLObjectType, GraphQLNonNull, GraphQLString} from 'graphql'
    import playlistType from './playlistType'

    export default new GraphQLObjectType({
      name: 'Query',
      description: 'The root query object',
      fields: () => ({
        playlist: {
          type: playlistType,
          args: {
            id: {
              type: new GraphQLNonNull(GraphQLString),
            },
          },
          resolve: (
            _,
            {id},
            {
              rootValue: {
                ctx: {backend},
              },
            },
          ) => backend.getModel('Playlist').load(id),
        },
      }),
    })
Note that we're using ES6 syntax here. We use babel with stage set to 0 to take advantage of all the latest and greatest ES7 stuff.
注意,我們在這裡使用了ES6語法。我們把babel的編譯階段設定為0,以此來利用最新和最棒的ES7內容。
We define a field that returns a playlist type (a GraphQL type definition that we define in another file and import here), set up a single argument named `id` of type non-null string, and then, most importantly, we define a function to "resolve" the object.
我們定義一個返回playlistType的欄位(該欄位是在另一個檔案中定義並在這裡匯入的GraphQL型別定義),設定一個名為 id
的引數,然後重點來了,我們要頂一個函式來“解決”(resolve)它。
The first argument to resolve is the current object itself (since we're at the root level, we ignore this argument). The second argument is the args passed to the GraphQL call, so we extract out the `id` field. The third argument provides us access to the GraphQL context, so we extract out our backend instance that we passed down from the `rootValue` elsewhere in the app and use it to fetch a playlist by ID.
要解決的第一個引數是當前物件的本身(因為我們現在處於根級,因此忽略這個引數)。第二個引數是傳遞給GraphQL呼叫的args, 因此我們可以從中提取出 id
欄位。第三個引數是GraphQL的上下文,通過它我們就能夠提取出從應用中其他地方的 rootValue
所傳遞下來的後端例項,然後按照ID來獲取播放列表(Playlist)。
It's that simple! We load the playlist from the database, return a JS object, and we're done at this level.
木有錯,就是這麼簡單!我們從資料庫載入playlsit,返回一個JS物件,就在這個層面搞定。
Next, let's define the playlist schema type:
接下來,我們來定義一下playlist的schema型別:
    // Note: graphql-js exposes list types as GraphQLList (there is no GraphQLArray export).
    import {GraphQLString, GraphQLList, GraphQLObjectType} from 'graphql'
    import trackType from './trackType'

    export default new GraphQLObjectType({
      name: 'Playlist',
      description: 'A Playlist',
      fields: () => ({
        id: {
          type: GraphQLString,
          resolve: it => it.uuid,
        },
        name: {type: GraphQLString},
        tracks: {
          type: new GraphQLList(trackType),
          resolve: it => it.tracks(),
        },
      }),
    })
So, here we define a new object type for Playlist. Since our root query resolver returned the playlist model instance, the first argument to our resolve functions at this level (named `it`) is that instance. So, for the id field, we are resolving by calling `it.uuid`, thus exposing the uuid model field under the name `id`. Remember that your GraphQL schema does not need to mirror your database schema.
所以,這裡我們為playlist定義了一個新的物件型別。由於我們的根查詢解析器(resolver)返回了playlist的模型例項,因此我們在該級別(名為 it
)的解析函式的第一個引數就是該例項。因此,對於id欄位,我們通過呼叫 it.uuid
來解析,從而暴露了 id
下的uuid模型欄位。請記住,GraphQL架構不需要映象你的資料庫schema。
For the `name` field, we do not provide a resolver, because the default for a field named `x` is `model.x`.
對於 name
欄位,我們不提供解析器,因為名為 x
的欄位的預設值是 model.x
;
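A simplified sketch of what such a default resolver does is shown below. This is an approximation written for illustration, not the graphql-js source; `defaultResolve` is a hypothetical name.

```javascript
// Simplified sketch of a default resolver: read the property named after the field.
function defaultResolve(source, fieldName) {
  const value = source[fieldName]
  // If the property is a method, call it on the source object.
  return typeof value === 'function' ? source[fieldName]() : value
}
```

So a field called `name` on a model with a `name` property needs no explicit `resolve` function, while a renamed field such as `id` (backed by `uuid`) does.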
For `tracks`, we call `it.tracks()` on the model to load tracks from the database.
至於 tracks
欄位,我們在模型上呼叫 it.tracks()
來從資料庫載入資料。
Note: there is a resolve function for every field, but this does not mean that an individual database query is required to fetch each field. You can fetch as much or as little as you like on `root.playlist`, so each of the sub-field resolvers can return something already fetched by their parent or issue further queries as necessary.
注意:每個欄位都有一個resolve函式與之對應,但這並不意味著對每個欄位都要進行一個獨立的資料庫查詢。你可以在 root.playlist
上獲取儘可能多的內容,因此每個子欄位的解析器都可以返回其父級中已經提取的內容,或者根據需要發出進一步的查詢。
Finally, let's define the GraphQL object type for a track:
最後,讓我們為track定義一下GraphQL物件型別。
    import {GraphQLString, GraphQLBoolean, GraphQLObjectType} from 'graphql'

    export default new GraphQLObjectType({
      name: 'Track',
      description: 'A Track',
      fields: () => ({
        id: {
          type: GraphQLString,
          resolve: it => it.uuid,
        },
        title: {type: GraphQLString},
        viewerHasLiked: {
          type: GraphQLBoolean,
          resolve: (
            it,
            _,
            {
              rootValue: {
                ctx: {auth},
              },
            },
          ) => (auth.isAuthenticated ? it.userHasLiked(auth.user) : null),
        },
      }),
    })
Similar to before, we define the `id` and `title` fields as simple resolvers. We also add a field `viewerHasLiked` and check authentication. If the user has not been authenticated, we return `null`. Otherwise we call `track.userHasLiked()` with the currently authenticated user. Again, the `auth` object is coming from our app outside of GraphQL in an Express middleware.
與之前相類似,我們將 id
和 title
欄位定義為簡單的解析器。我們還添加了一個欄位 viewerHaslinked
並檢查身份驗證。如果使用者尚未通過身份驗證,那麼返回 null
。如若已經驗證,那麼我們用當前經過身份驗證的使用者來呼叫 track.userHasLiked()
。同樣的, auth
物件來自GraphQL之外的Express中介軟體。
Given that `Playlist.load()` loads a playlist, `playlist.tracks()` loads the array of tracks for that playlist from the database, and `track.userHasLiked()` queries the database for the existence of an association between a user and a track, our GraphQL query will resolve correctly, and we have essentially duplicated the functionality of the REST API once we get the other fields defined (omitted here for brevity).
鑑於 Playlist.load()
載入播放列表, playlist.tracks()
載入播放列表的歌曲陣列, track.userHasLiked()
查詢資料庫中是否存在使用者和歌曲之間的關聯,我們的GraphQL查詢操作將能夠進行正確地解析了,一旦我們獲得定義的其他欄位就可以宣告基本上覆制了REST API的功能。
This solves one of our two issues with our REST API: clients can now request only the data they need, which benefits mobile app performance in a variety of ways. But we still have the problem of N+1 queries - if we request `viewerHasLiked` for all 50 tracks of this playlist, we will get 50 queries. We solved this using a quite ingenious little npm module from Facebook called DataLoader.
這樣就解決了我們 REST API 的兩個問題之一:客戶現在只能請求他們需要的資料,從各種意義上講都有助於改善移動應用的效能。但是我們仍然存在N + 1個查詢的問題 - 如果我們為這個播放列表的所有50個曲目請求 viewerHasLiked
,我們將發出50個查詢。我們使用來自Facebook的一個非常巧妙的小npm庫解決了這個問題,它叫DataLoader。
DataLoader FTW
DataLoader就是王道

graphql/dataloader provides an API that consolidates any calls to `load()` in a frame of execution (event loop tick) and then batch-loads data based on the collection of calls. Additionally, it caches results by key, so subsequent calls to `load()` with the same arguments return the cached result directly.
graphql/dataloader 提供了一個API,它可以在執行幀(事件迴圈tick)中合併對 load()
的任何呼叫,然後根據呼叫批量地載入資料。另外,它通過key快取結果,因此後續呼叫帶有相同引數的 load()
會直接返回快取。
So, if we call `myDataLoader.load(id)` many different times in a frame of execution, then once that frame completes, the data loader would be provided with an array of all the IDs and can batch-load the requested data. I would highly recommend reading the README to better understand DataLoader's workings.
因此,如果我們在執行幀中多次呼叫 myDataLoader.load(id)
,那麼一旦該幀完成,資料載入器將提供所有ID的陣列,並可以批量地載入所請求的資料。我強烈地建議您閱讀README以更好地瞭解DataLoader的工作原理。
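To show the batching idea in isolation, here is a tiny stand-in written for this post. It is not the real DataLoader implementation (it has no caching, for example), and `createTinyLoader` is a hypothetical name; it only demonstrates how several `load()` calls made in the same tick can be collected into one batch.

```javascript
// Minimal stand-in for the batching idea; not the real DataLoader implementation.
function createTinyLoader(batchFn) {
  let queue = []
  return {
    load(key) {
      return new Promise((resolve, reject) => {
        queue.push({key, resolve, reject})
        // The first key queued in this tick schedules a single flush.
        if (queue.length === 1) {
          queueMicrotask(() => {
            const batch = queue
            queue = []
            Promise.resolve(batchFn(batch.map(item => item.key)))
              .then(values => batch.forEach((item, i) => item.resolve(values[i])))
              .catch(error => batch.forEach(item => item.reject(error)))
          })
        }
      })
    },
  }
}

// Record every batch the loader hands to the batch function.
const batches = []
const loader = createTinyLoader(keys => {
  batches.push(keys)
  return keys.map(key => key.toUpperCase())
})

// Three load() calls in the same tick end up in one batch.
const results = Promise.all([loader.load('a'), loader.load('b'), loader.load('c')])
```

The batch function receives all three keys at once and must return one value per key, in the same order.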
In our case, we can model `track.userHasLiked()` as a call to a DataLoader instance designed for resolving the relationship between a user and track in bulk. Something like this:
在我們的例子中,我們可以將 track.userHasLiked()
作為對DataLoader例項的呼叫,該例項旨在解決使用者(user)和曲目(track)之間的關係。如下:
    import DataLoader from 'dataloader'
    import BaseModel from './BaseModel'

    const likeLoader = new DataLoader(requests => {
      // requests is now an array of [track, user] pairs.
      // 請求現在變成了一個[track, user]陣列對
      // Batch-load the results for those requests, reorder them to match the order of requests and return.
      // 批量載入這些請求的結果,對它們重新排序以符合請求的順序再將其返回
    })

    export default class Track extends BaseModel {
      userHasLiked(user) {
        return likeLoader.load([this, user])
      }
    }
With this code in place, the 50 calls to `likeLoader.load()` will result in one call to the batch load function, meaning that our GraphQL query will now execute 3 database queries rather than 52.
憑藉這段程式碼,對 likeLoader.load()
的50次呼叫就變成了對批量載入函式的一次呼叫,這意味著我們的GraphQL查詢現在將執行3次資料庫查詢而非52次。
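The "reorder to match the order of requests" step left as a comment above could look roughly like this. It is a sketch under assumptions: requests are simplified to `[trackId, userId]` pairs of IDs rather than model instances, and `likeRows` stands for whatever a single bulk query returns; the `alignLikes` helper is hypothetical.

```javascript
// requests: array of [trackId, userId] pairs, in the order load() was called.
// likeRows: rows returned by one bulk query, in arbitrary order.
function alignLikes(requests, likeRows) {
  const found = new Set(likeRows.map(row => `${row.trackId}:${row.userId}`))
  // One boolean per request, in request order, as a batch function must return.
  return requests.map(([trackId, userId]) => found.has(`${trackId}:${userId}`))
}
```

DataLoader expects the batch function to return a promise of an array with the same length and order as the keys it was given, which is why the mapping runs over `requests` rather than over the rows.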
As indicated on the DataLoader README, we take this one step further by composing DataLoader instances all the way to the database query level.
正如DataLoader說明文件所示,我們通過將DataLoader例項總是組合到資料庫查詢來更進一步。
For example, if we wanted to fetch users by username, we would have:
例如,如果我們想要通過使用者姓名來取回使用者資料,我們可以:
- `batchQueryLoader` - a DataLoader with caching disabled that accepts database queries, executes them against the database (using batch / parallel features for performance speedups), and returns the results.
- `userByIDLoader` - a DataLoader that accepts IDs, uses `batchQueryLoader` to query the database, and returns user objects.
- `userByUsernameLoader` - a DataLoader that accepts usernames, uses `batchQueryLoader` to query the database for user IDs, then calls `userByIDLoader` to return user objects.
-
batchQueryLoader
- 這是一種禁用快取的DataLoader,它接受資料庫查詢欄位,執行它們(使用批處理/並行功能實現效能加速)並返回結果。 -
userByIDLoader
- 這是一個接受ID的DataLoader,使用batchQueryLoader
來查詢資料庫,並返回使用者物件。 -
userByUsernameLoader
- 這是一個接受使用者名稱的DataLoader,使用batchQueryLoader
在資料庫中查詢使用者ID,然後呼叫userByIDLoader
返回使用者資料。
With this DataLoader composition, the `batchQueryLoader`, used by all other DataLoaders, ensures database activity is batched and latency is reduced. And since `userByUsernameLoader` resolves IDs then calls `userByIDLoader`, `userByIDLoader` becomes a shared cache, reducing queries overall. In our setup, we even added a DataLoader for Redis using pipelines and integrated it into our other loaders as a caching layer, further reducing query time.
通過把DataLoader組合起來,所有其他DataLoader使用的batchQueryLoader可以確保批量處理資料庫活動並減少延遲。由於 userByUsernameLoader
解析ID然後呼叫 userByIDLoader
, userByIDLoader
成為共享快取,從而減少了整體查詢。在我們的設定中,我們甚至使用pipeline為Redis添加了一個DataLoader,並將其作為快取層整合到了我們的其他loader中,又進一步地縮短了查詢時間。
Also, as mentioned before, DataLoaders cache their results by the arguments of load(). Because of this, we initialize a fresh set of DataLoaders for each request: data is cached for the life of a single request and discarded once the request completes.
另外,如前文所述,DataLoaders通過 load()
的引數快取它們的結果。基於這個事實,我們為每個請求都初始化DataLoaders,因此在單個請求的生命週期中,資料被快取,然後在請求完成後就被丟棄了。
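The per-request lifetime can be illustrated with a small sketch. The names `createRequestLoaders`, `handleRequest`, and `fetchUser` are hypothetical, and the loader here is a plain memoizing object rather than a real DataLoader; with the actual library, the factory would presumably construct `new DataLoader(...)` instances instead.

```javascript
// Creates loaders whose cache lives only as long as the returned object.
// `fetchUser` is a hypothetical stand-in for a real data source.
function createRequestLoaders(fetchUser) {
  const cache = new Map()
  return {
    userByID: {
      load(id) {
        // Cache the promise so concurrent loads of one ID share one fetch.
        if (!cache.has(id)) cache.set(id, fetchUser(id))
        return cache.get(id)
      }
    }
  }
}

// Called once per incoming request, e.g. while building the GraphQL context.
// The loaders (and their cache) become garbage once the request is done.
function handleRequest(fetchUser, ids) {
  const loaders = createRequestLoaders(fetchUser)
  return Promise.all(ids.map(id => loaders.userByID.load(id)))
}
```

Because each request gets its own factory call, repeated IDs within one request are fetched once, while a later request starts with an empty cache and never sees stale data from an earlier one.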
Using this architecture, the entire requested playlist from the beginning, the one that took 170 queries and around 15s to render, returns in about 250ms with only 3 database queries, and around 17ms reading data from the Redis cache. This solves both performance issues.
使用這種架構,從一開始就請求整個playlist,也就是170個查詢和大約15秒渲染的播放列表,在250毫秒內就返回了,只運行了3個數據庫查詢,其中大約17毫秒是從Redis快取中讀取資料。這樣就解決了那兩個效能問題。
Future Puzzles
未來的難題
Mutations (Writes)
變更操作(寫入操作)
Our GraphQL server provides read capabilities for our entire API surface, but writes have yet to be implemented. graphql-js provides an easy DSL for handling GraphQL mutations, so we will shortly be integrating writes into the GraphQL system. This appears to be a straightforward task, but it will be interesting to discover what, if any, insights or best practices emerge from the implementation.
我們的GraphQL伺服器為整個API表面都提供了讀取功能,但尚未實現寫入操作。graphql-js為處理GraphQL變更提供了簡單的DSL,所以我們很快就會將寫入操作整合到GraphQL系統中。這似乎是一項簡單的任務,但如果從實踐中取得收穫,那將會很有趣。
Client-side Caching
客戶端快取
We have yet to solve caching GraphQL responses on the client. Ideally the system fetching data from a GraphQL endpoint would understand the underlying schema by utilizing schema introspection and thus would be able to intelligently cache sub-resources, so updates to a model at one location would update everywhere. Further considerations like TTLs, forced updates, etc. would need to be implemented.
我們尚未在客戶端上解決快取GraphQL返回資料的問題。理想情況下,從GraphQL端獲取資料的系統將通過schema自檢來理解底層的schema,從而能夠智慧地快取子資源,因此在一個位置對Model進行更新的話也將在其他所有地方更新,所以需要進一步考慮TTL、強制更新等情況。
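One way to picture "updates to a model at one location would update everywhere" is a normalized cache keyed by object ID. The sketch below is purely illustrative and is not how Relay or any particular client works: `normalize`, `denormalize`, and the `__ref` marker are invented for this example, every cached object is assumed to carry a unique `id`, and cyclic references, TTLs and forced updates are not handled.

```javascript
// Stores every object that has an `id` exactly once in `store` (a Map) and
// replaces nested objects with { __ref: id } pointers to their entries.
function normalize(value, store) {
  if (Array.isArray(value)) return value.map(item => normalize(item, store))
  if (value === null || typeof value !== 'object') return value
  const record = {}
  for (const [key, child] of Object.entries(value)) {
    record[key] = normalize(child, store)
  }
  if (record.id === undefined) return record
  // Merge with any existing entry so partial responses update known fields.
  store.set(record.id, { ...(store.get(record.id) || {}), ...record })
  return { __ref: record.id }
}

// Rebuilds a plain object tree by following { __ref } pointers.
function denormalize(value, store) {
  if (Array.isArray(value)) return value.map(item => denormalize(item, store))
  if (value === null || typeof value !== 'object') return value
  const source = '__ref' in value ? store.get(value.__ref) : value
  const result = {}
  for (const [key, child] of Object.entries(source)) {
    result[key] = denormalize(child, store)
  }
  return result
}
```

With this shape, a playlist response and a later user response that mention the same owner write to the same store entry, so re-reading the playlist reflects the newer owner data.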
If I understand correctly, Relay may solve some of these concerns; however, Relay is still new, does not currently support React Native, and does not run in a native code environment.
如果理解是正確的,Relay可能會解決其中的一些問題,但Relay仍然是新的,目前還不支援React Native,也無法在原生環境下執行。
Real-time or Push Updates
實時或推送更新
There are several aspects of our platform that are "real-time," and it would be awesome to integrate these aspects into our GraphQL backend, perhaps allowing live "subscriptions" to particular sets of data.
我們的平臺有幾個方面是“實時的”,將這些方面整合到我們的GraphQL後端中可能很棒,從而實現對特定資料集的實時“訂閱”。
Query Performance Protection
查詢效能保護
If we expose something like the followers of a given user, then theoretically a malicious client could submit a request like user.followers.followers... until the server struggled to respond. We do not have a full solution for this yet, especially if we decide to expose our GraphQL endpoints as a public API at some future point. Three possible paths to explore come to mind:
如果我們暴露某些使用者的關注者之類的資料,那麼理論上就會有別有用心之人惡意地提交像user.followers.followers這樣的請求.....直到伺服器哭著把資料返回。我們尚沒有完整的解決方案,特別是如果我們決定在未來某個時候將我們的GraphQL介面公開為公共API的時候。我想到了3條可能的前進方向:
1. Perform schema AST inspection to validate the query is not too "complex," rejecting queries over a threshold.
2. Have some form of query "timeout," kill requests that take too long to resolve, and rate-limit the ability of a single request to query the database.
3. Take a note from Facebook and implement a "query cache" where queries are stored in a cache and clients refer to them by ID in production rather than passing the full query, essentially whitelisting queries. This only works if the GraphQL API is only for internal clients.
- 啟動Schema的AST檢查以此來驗證查詢是否太過“複雜”,並拒絕超過閾值的查詢。
- 有一些形式的查詢“超時”了,對於這種請求,可以考慮幹掉,並限制單個請求查詢資料庫的能力。
- 給Facebook提建議實現“查詢快取”,其中查詢儲存在快取中,客戶端通過生產中的ID引用它們,而非傳遞完整的查詢,基本上來說是將查詢列入白名單。這僅適用於內部客戶端的GraphQL API。
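The first option, rejecting overly complex queries by inspecting the AST, could start with something as simple as a nesting-depth limit. The sketch below walks nodes that follow the graphql-js AST shape (a node may carry `selectionSet.selections`), but the AST here is built by hand so the example stays self-contained; in practice it would come from the parser. `selectionDepth`, `assertDepthWithin`, and the `field` helper are names made up for this sketch, and fragment spreads are not followed, so a real check would need more work.

```javascript
// Returns the deepest level of field nesting below `node`.
// Simplification: fragment spreads are not followed.
function selectionDepth(node) {
  if (!node.selectionSet) return 0
  const depths = node.selectionSet.selections.map(selectionDepth)
  return 1 + Math.max(0, ...depths)
}

// Throws when the operation nests deeper than `maxDepth`; otherwise returns the depth.
function assertDepthWithin(operation, maxDepth) {
  const depth = selectionDepth(operation)
  if (depth > maxDepth) {
    throw new Error(`Query depth ${depth} exceeds limit of ${maxDepth}`)
  }
  return depth
}

// Hypothetical helper that builds a Field node by hand.
const field = (name, selections) => ({
  kind: 'Field',
  name: { kind: 'Name', value: name },
  ...(selections ? { selectionSet: { kind: 'SelectionSet', selections } } : {})
})

// Hand-built AST for: { user { followers { followers { name } } } }
const operation = {
  kind: 'OperationDefinition',
  operation: 'query',
  selectionSet: {
    kind: 'SelectionSet',
    selections: [
      field('user', [field('followers', [field('followers', [field('name')])])])
    ]
  }
}
```

A depth limit alone does not capture cost (a shallow query can still request huge lists), so it would likely be combined with the timeout and rate-limiting ideas in the second option.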
Conclusion
結論
In conclusion, GraphQL is pretty awesome and has been solving some real-world problems at Playlist. For us, it is more than hype, and I wanted to share some of our findings in the hope that they may help others understand it. Cutting-edge technologies and projects are fun, but can sometimes be difficult to comprehend and apply.
一言以蔽之,GraphQL棒極了,並且真的已經解決了Playlist中的一些現實問題。對我們來說,這不僅僅是幫它吹一波,我更想分享一些我們的發現,希望它可以幫助別人理解GraphQL。尖端技術和專案很有趣,但也有難以理解和採坑的風險。