Wandering - Should we use pointers to pass structs in Go or not?

hossein · April 8, 2023, 11:50 AM

Writing a full article takes a lot of patience, but I sometimes wander around the internet reading up on one specific topic. I'll try to share a summary of the evaluations I do for myself with you. Let's call it "Wandering".

The question is: should we use pointers or not? Especially for passing structs.

My personal conclusion from what I've read is: don't, unless you really intend to update the state of one of the fields later on in the struct you're returning from the function.

As for performance, I don't really agree with thinking about it at such a fine-grained level. Very often you write code that is quite slow, but in practice that slow code is going to run once a year, so why should you think about optimizing it?!

There are many other criteria to consider when it comes to performance. But even if performance is the concern, returning a value instead of a pointer often lets the compiler give you memory on the stack instead of the heap (escape analysis decides this), which makes allocation and deallocation faster and gives the GC less work.

Now let's get to the results of the wandering!

This article explains in depth, and correctly, why Go is always pass-by-value.

Go is pass-by-value — but it might not always feel like it

Dave Cheney himself also has an article on this.

There is no pass-by-reference in Go

This is a very important point.

Unlike C++, each variable defined in a Go program occupies a unique memory location.
It is not possible to create a Go program where two variables share the same storage location in memory.
Go does not have pass-by-reference semantics because Go does not have reference variables.
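
To make the quoted point concrete, here is a minimal sketch (the helper `sameStorage` is a made-up name for this illustration, not something from the article). It shows that copying a variable creates a new storage location, and that even two pointers to the same value are themselves separate variables.

```go
package main

import "fmt"

// sameStorage is a hypothetical helper: it reports whether two int pointers
// refer to the same memory location.
func sameStorage(p, q *int) bool { return p == q }

func main() {
	a := 10
	b := a // b is a new variable holding a copy of a's value

	b = 20                           // changing b leaves a untouched
	fmt.Println(a, b)                // 10 20
	fmt.Println(sameStorage(&a, &b)) // false: two variables, two locations

	p := &a                        // p is itself a variable; its value is a's address
	q := p                         // q is a copy of that address, stored in its own location
	fmt.Println(sameStorage(p, q)) // true: both hold the same address
	fmt.Println(&p == &q)          // false: p and q are distinct variables
}
```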

This one gets to the heart of the matter.

Despite copying a struct with several fields is slower than copying a pointer to the same struct, returning a struct value may be faster than returning a pointer if we consider escape analysis particularities.

Returning structs allows the compiler to detect that the created data does not escape the function scope and decides to allocate the memory in the stack, where the allocation and deallocation is very cheap if compared to managing memory in the heap.
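
A rough way to observe this yourself is sketched below. The type `point`, the two constructors, and `allocsPerCall` are hypothetical names for this illustration; `testing.AllocsPerRun` is the standard library helper that reports average heap allocations per call. The results are stored in package-level variables on purpose, so the pointer definitely escapes. Running `go build -gcflags=-m` should also print the compiler's escape-analysis decisions, and the exact outcome can vary between compiler versions.

```go
package main

import (
	"fmt"
	"testing"
)

// point is a hypothetical small struct used only for this illustration.
type point struct{ x, y, z int }

// Package-level sinks keep the results alive so the calls are not optimized away.
var (
	valueSink   point
	pointerSink *point
)

// newValue returns the struct itself; the result is copied to the caller.
func newValue(i int) point { return point{i, i, i} }

// newPointer returns a pointer; once it is stored in a global, the struct
// has to outlive the call and is allocated on the heap.
func newPointer(i int) *point { return &point{i, i, i} }

// allocsPerCall reports the average number of heap allocations per call
// for both constructors.
func allocsPerCall() (byValue, byPointer float64) {
	byValue = testing.AllocsPerRun(1000, func() { valueSink = newValue(1) })
	byPointer = testing.AllocsPerRun(1000, func() { pointerSink = newPointer(1) })
	return byValue, byPointer
}

func main() {
	v, p := allocsPerCall()
	fmt.Printf("value: %.0f allocs/call, pointer: %.0f allocs/call\n", v, p)
}
```

This only measures allocation counts, not time. A pointer that does not escape can still stay on the stack, so the difference shows up only when the compiler cannot prove the value stays local.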

This one says it too.

First, Go technically has only pass-by-value. When passing a pointer to an object, you’re passing a pointer by value, not passing an object by reference. The difference is subtle but occasionally relevant. For example, you can overwrite the pointer value which has no impact on the caller, as opposed to dereferencing it and overwriting the memory it points to.
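
The difference described in the quote can be sketched like this (the type `user` and the functions `reassign` and `mutate` are made-up names for the illustration): overwriting the pointer only changes the callee's copy, while dereferencing it changes the memory the caller also sees.

```go
package main

import "fmt"

// user is a hypothetical type used only for this illustration.
type user struct{ name string }

// reassign overwrites its local copy of the pointer; the caller's pointer
// still refers to the original struct.
func reassign(u *user) {
	u = &user{name: "replaced"}
}

// mutate dereferences the pointer and overwrites the memory it points to,
// so the caller observes the change.
func mutate(u *user) {
	*u = user{name: "mutated"}
}

func main() {
	u := &user{name: "original"}

	reassign(u)
	fmt.Println(u.name) // original

	mutate(u)
	fmt.Println(u.name) // mutated
}
```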

This article by dear Mohammad Hosseini Rad also ran benchmarks and gives good explanations, including that a value shared through a pointer tends to end up on the heap, that every so often the GC pauses the application for a few milliseconds to do its work, and that accessing the stack is much faster than accessing the heap. So you should really pass a pointer to a variable only where it's necessary.

This one also says to try to return the struct itself rather than a pointer to it, unless the struct is very large or you really want to modify its value.

This one also has very good explanations.

Is the struct reasonably large? Prematurely optimising is rarely good, but if you have a struct with more than a handful of fields and/or fields containing large strings or byte arrays (e.g. a markdown body) the benefits to returning a pointer instead of a copy becomes more apparent.
What’s the risk of the returning function mutating the struct (or object) after it returns it? (i.e. some long-running task or job)

Especially this point:

I nearly always return a pointer from a constructor as the constructor should be run once

William Kennedy also answered here.

I have a bit of a different philosophy. Mine is based on making a decision about the type you are defining. Ask yourself a single question. If someone needs to add or remove something from a value of this type, should I create a new value or mutate the existing value. This idea comes from studying the standard library.

Someone else also agreed with Kennedy's view.

I like William’s point in his blog post - you very rarely need to worry about the performance. Usually, just use what makes sense otherwise.

William Kennedy himself also has an interesting article on this.
First of all, he says don't make performance your main concern; prioritize things like being idiomatic and readable.

don’t make coding decisions based on unfounded thoughts you may have about performance. Make coding decisions based on the code being idiomatic, simple, readable and reasonable.

Then he says he did a discovery on the standard library and evaluated three different categories.

These type classifications are built-in, struct and reference types

He counts string as a built-in type as well, and it is also immutable.

In general, don’t share built-in type values with a pointer.

Here he says that if you built the struct to be treated like a primitive data type, return a copy of it; otherwise return a pointer. He says that what the factory function for that struct returns is a good clue to which of the two it is.

If you review more code from the standard library, you will see how struct types are either implemented as a primitive data value like the built-in types or implemented as a value that needs to be shared with a pointer and never copied. The factory functions for a given struct type will give you a great clue as to how the type is implemented.
In general, share struct type values with a pointer unless the struct type has been implemented to behave like a primitive data value.

According to his additional explanations, if the struct's data isn't going to change, like the request and response structs we build and return, again pass the struct itself, not a pointer to it.

What I understood is that if a struct holds some state, like a file, then a pointer to it should be returned, so that nobody can later turn the file handle that the operating system has pointing at this real file into the handle of a different file. That's why in the os package itself the Open function returns *File, and that is the right call.

If you are still not sure, this is another way to think about. Think of every struct as having a nature. If the nature of the struct is something that should not be changed, like a time, a color or a coordinate, then implement the struct as a primitive data value. If the nature of the struct is something that can be changed, even if it never is in your program, it is not a primitive data value and should be implemented to be shared with a pointer. Don’t create structs that have a duality of nature.
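
As a small sketch of the two "natures" (the types `coordinate` and `account` and the factory `newAccount` are invented for this illustration, not taken from Kennedy's article): a coordinate is treated like a primitive value, so its method returns a new value, while an account carries changing state and is shared with a pointer from its factory function.

```go
package main

import "fmt"

// coordinate behaves like a primitive data value: it is copied freely and
// its methods return a new value instead of changing the receiver.
type coordinate struct{ lat, lon float64 }

func (c coordinate) shifted(dLat, dLon float64) coordinate {
	return coordinate{c.lat + dLat, c.lon + dLon}
}

// account carries state that changes over time, so it is shared with a pointer.
type account struct{ balance int }

// newAccount is a factory function; returning *account signals
// "share this value, don't copy it".
func newAccount(initial int) *account { return &account{balance: initial} }

// deposit uses a pointer receiver because it mutates the existing value.
func (a *account) deposit(amount int) { a.balance += amount }

func main() {
	home := coordinate{1, 2}
	moved := home.shifted(1, 0)
	fmt.Println(home, moved) // home is unchanged; moved is a new value

	acc := newAccount(100)
	acc.deposit(50)
	fmt.Println(acc.balance) // 150
}
```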

On reference types he says it's very rare to need to return a pointer (Go doesn't really have reference types, but he means value types whose value is itself a reference to another object).

Reference types are slices, maps, channels, interface and function values. These are values that contain a header value that references an underlying data structure via a pointer and other meta-data. We rarely share reference type values with a pointer because the header value is designed to be copied. The header value already contains a pointer which is sharing the underlying data structure for us by default.
If you review more code from the standard library, you will see how values from reference types in most cases are not shared with a pointer. Since the reference type contains a header value whose purpose is to share an underlying data structure, sharing these values with a pointer is unnecessary. There is already a pointer in use.
In general, don’t share reference type values with a pointer unless you are implementing an unmarshal type of functionality.
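
A short sketch of what copying the header means in practice (the function names `addKey`, `appendOne`, and `decode` are made up for this example): a map passed by value still shares its underlying data, an append on a copied slice header does not change the caller's length, and decoding into a slice is the "unmarshal" case where a pointer is needed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// addKey receives a copy of the map header; both copies share the same
// underlying data, so the caller sees the new key.
func addKey(m map[string]int) { m["added"] = 1 }

// appendOne appends to its own copy of the slice header; the caller's
// header, including its length, does not change.
func appendOne(s []int) {
	s = append(s, 1)
}

// decode needs a pointer to the slice so that json.Unmarshal can replace
// the caller's header. This is the unmarshal exception from the quote.
func decode(data []byte, out *[]int) error { return json.Unmarshal(data, out) }

func main() {
	m := map[string]int{}
	addKey(m)
	fmt.Println(len(m)) // 1

	s := []int{}
	appendOne(s)
	fmt.Println(len(s)) // 0

	var nums []int
	if err := decode([]byte(`[1, 2, 3]`), &nums); err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(nums) // [1 2 3]
}
```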

This one also says it's very rare to need a pointer, for a variety of reasons.

It also asks: what do we mean by a big struct?

This article ran benchmarks, and the struct really has to be very, very large for a pointer to perform better.

it turns out that only extremely large structs can cause slower performance compared to pointers.

This [benchmark](https://philpearl.github.io/post/bad_go_pointer_returns/) is good too; it says only in very rare cases should you return a pointer instead of the struct.

This one also puts it very nicely:

The stack is your friend if you care about performance.

And this matches my own conclusion: return a non-pointer unless you know its state is going to change after it's constructed.

My general rule of thumb: return a non-pointer unless you know the instance will need to change after construction.

I also strongly agree with this [point](https://the-zen-of-go.netlify.app/): first prove that your application's performance problem really comes from using or not using a pointer, and only then think about fixing it.

For example, if you're holding a file of a few dozen megabytes in a variable, then yes, it really matters to think about it, because the difference is significant. But if that's not the case, your application's performance problem may well be somewhere else entirely.

As for the performance question: The Zen of Go applies: If you think it’s slow, first prove it with a benchmark. In practice, I have never had to care about the level of performance where the difference between a pointer and a non-pointer mattered. I believe that the instances where that might matter are very rare.
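
If you do want to prove it with a benchmark, a minimal sketch could look like the following. The `payload` type, its 4 KiB array size, and the function names are assumptions chosen for illustration; `testing.Benchmark` is the standard library function for running a benchmark outside of `go test`. The numbers depend heavily on the machine, the Go version, and the struct size, so treat the output as a measurement of this toy case only.

```go
package main

import (
	"fmt"
	"testing"
)

// payload is a hypothetical struct; the 4 KiB array size is an arbitrary assumption.
type payload struct {
	id   int
	data [4096]byte
}

// sink keeps results alive so the calls are not optimized away.
var sink int

// byValue receives a full copy of the struct on every call.
//
//go:noinline
func byValue(p payload) int { return p.id + int(p.data[0]) }

// byPointer receives only the address of the struct.
//
//go:noinline
func byPointer(p *payload) int { return p.id + int(p.data[0]) }

func benchByValue(b *testing.B) {
	p := payload{id: 1}
	for i := 0; i < b.N; i++ {
		sink = byValue(p)
	}
}

func benchByPointer(b *testing.B) {
	p := &payload{id: 1}
	for i := 0; i < b.N; i++ {
		sink = byPointer(p)
	}
}

func main() {
	// Each call runs the benchmark function with an automatically chosen b.N.
	fmt.Println("by value:  ", testing.Benchmark(benchByValue))
	fmt.Println("by pointer:", testing.Benchmark(benchByPointer))
}
```

In a real project the same two functions would normally live in a `_test.go` file as `BenchmarkXxx` functions and be run with `go test -bench=.`.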

It also recommends a YouTube video:

Understanding Allocations: the Stack and the Heap - GopherCon SG 2019

CodeReviewComments itself also says to pass by value unless the struct is large. How large counts as "large" is a question in itself, but it seems to mean very, very large!