rpc: avoid copying into UniValue #30115

theuni · 2024-05-15T18:52:46Z

These are the simple (and hopefully obviously correct) copies that can be moves instead.

This is a follow-up from #30094 (comment)

As it turns out, there are hundreds of places where we copy UniValues needlessly. It should be the case that moves are always preferred over copies, so there should be no downside to these changes.

willcl-ark, however, noticed that memory usage may increase in some cases. Logically this makes no sense to me. The only plausible explanation imo is that because the moves are faster, more ops/second occur in some cases.

This list of moves was obtained by changing the function signatures of the UniValue functions to accept only rvalues, then compiling and fixing them up one by one. There still exist many places where copies are being made. These can/should be fixed up, but weren't done here for the sake of doing the easy ones first.

I ran these changes through clang-tidy with performance-move-const-arg and bugprone-use-after-move and no bugs were detected (though that's obviously not to say it can be trusted 100%).

As stated above, there are still lots of other less trivial fixups to do after these including:

Using non-const UniValues where possible so that moves can happen
Refactoring code in order to be able to move a UniValue without introducing a use-after-move
Refactoring functions to accept UniValues by value rather than by const reference

DrahtBot · 2024-05-15T18:52:49Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	achow101, ryanofsky, willcl-ark
Concept ACK	TheCharlatan, hebasto, laanwj, stickies-v

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#30058 (Encapsulate warnings in generalized node::Warnings and remove globals by stickies-v)
#29039 (versionbits refactoring by ajtowns)
#28710 (Remove the legacy wallet and BDB dependency by achow101)
#27638 (rpc: show P2(W)SH redeemScript in getrawtransaction rpc: show P2(W)SH redeemScript in getrawtransaction #27637 by Riahiamirreza)
#25979 ([WIP] wallet: standardize change output detection process by furszy)
#21283 (Implement BIP 370 PSBTv2 by achow101)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

ryanofsky

Code review ACK 35b5b0f. I confirmed this change is only adding std::moves, and also tried to look for cases where an object was being used after a move, since that seems like the most likely type of bug this PR could introduce. I didn't check exhaustively, though, since that would be a lot more work and I think we have a clang-tidy check that would catch those cases.

theuni · 2024-05-16T01:44:46Z

I didn't check exhaustively, though, since that would be a lot more work and I think we have a clang-tidy check that would catch those cases.

Indeed, I believe c-i should be doing that check. But just in case, from the PR description:

I ran these changes through clang-tidy with performance-move-const-arg and bugprone-use-after-move and no bugs were detected (though that's obviously not to say it can be trusted 100%).

TheCharlatan · 2024-05-16T11:56:01Z

Concept ACK

Could this be a case for a clang-tidy plugin?

theuni · 2024-05-16T15:45:18Z

Could this be a case for a clang-tidy plugin?

I considered this, but I don't think so. UniValue copies are reasonable in many cases, but our use of them often lends itself to moving. So we can't detect and disable copies as a rule, and (I assume) if clang-tidy could detect and suggest possible moves as optimizations, it would be offering a generic version of that already.

Here's an upstream discussion about it that has apparently gone stale: llvm/llvm-project#53489

hebasto · 2024-05-16T15:45:51Z

Concept ACK.

ryanofsky · 2024-05-16T15:57:49Z

Could this be a case for a clang-tidy plugin?

I think the ideal thing to do for types that are potentially expensive to copy is to disable implicit copies, but provide explicit T Copy() const and void CopyFrom(const T&) methods so copies can be made where needed. (There was a DISABLE_IMPLICIT_COPIES macro proposed in #24641 that tried to do this, but it didn't seem possible to generalize the implementation so we never added it.) It think it would be probably be reasonable to allow univalue to be explicitly but not implicitly copied, but I'm sure @theuni would have a better intuition.

laanwj · 2024-05-16T18:29:01Z

Concept ACK.

Agree with @ryanofsky that if copy is something expensive (or generally undesirable) to do, it would make sense to make copy explicit, so that it is always a deliberate choice and stands out in code review.

stickies-v · 2024-05-17T09:39:01Z

Concept ACK

Indeed, I believe c-i should be doing that check.

It does:

bitcoin/src/.clang-tidy

Line 6 in 2f53f22

bugprone-use-after-move,

willcl-ark · 2024-05-17T12:56:43Z

Concept ACK.

The code changes all look correct to me too, however

@willcl-ark, however, noticed that memory usage may increase in some cases. Logically this makes no sense to me. The only plausible explanation imo is that because the moves are faster, more ops/second occur in some cases.

I still have not been able to explain why i see these type of results, but I think it's most likely user error somewhere along the line.

Here is one example of a puzzling result I have, in this test fetching every 5000th block via REST (previously largely addressed by #30094). In all screenshots this PR is on the right hand side:

We can see here this PR made 96m allocations, vs 137m, a strict improvement?

The unnecessary calls to allocator functions are clearly gotten rid of:

However despite that total resource usage increases, albeit only a little, ~10MB (also see first image) :

The changes in this PR which affect this test (since #30094) are in core_write.cpp

I guess this may just be allocator stuff I might never get to the bottom of, but I will continue to investigate this for a little bit before bringing a full ACK back :)

FWIW these changes did provide a measurable speedup on my test, roughly of this order:

this PR: ~ 1:43
master: ~ 1:55

I also see similar result just fetching a single block (in this case block 800,000). Many fewer allocations, but larger total usage:

I think the clues may lie in here, but I am too dumb to know what they are. For some reason after this PR a second ~30MB allocation is made. Without this PR there are 37.2MB and 24.0MB allocations made,, and with this PR those are 38.2MB and 30.9MB, and I don't understand why:

heaptrack dumps in case anyone else wants to take a look:

heaptrack_dumps.zip

theuni · 2024-05-17T17:00:32Z

@willcl-ark Thanks, that's very helpful. Out of curiosity, what build options are you using for these? Are optimizations on?

maflcko · 2024-05-20T15:05:51Z

Maybe rebase and re-bench after 75118a6, which replaced some copies with moves as well?

theuni · 2024-05-20T15:23:23Z

Maybe rebase and re-bench after 75118a6, which replaced some copies with moves as well?

Ah, great, thanks for pointing me to that. I had pretty much this same commit locally to address the third point in this PR's description:

Refactoring functions to accept UniValues by value rather than by const reference

Though I think there are still other functions that need the same treatment.

These are simple (and hopefully obviously correct) copies that can be moves instead.

theuni · 2024-05-22T15:47:30Z

I think it's not really worth worrying about benchmarks here, I think it's pretty clear that moving can only be better.

I'd like to get this in if there are no objections, as there's some more low-hanging fruit wrt UniValue moves.

achow101 · 2024-05-22T17:57:41Z

ACK d7707d9

Checked that only std::moves were added, and letting the compiler warn for use after move, which there does not appear to be any.

ryanofsky

Code review ACK d7707d9. No changes since last review other than rebase. I agree benchmarks showing increased peak memory usage and RSS are surprising, but number of allocations is down as expected, and runtime is also decreased.

The fact that "peak allocations" appear to be larger after the change seems interesting. Before the change, the largest allocations are 38.4MB, 26.7MB, and 8.4MB, but after the change they are 39.4MB, 33.7MB, and 10.8MB. I'm not sure what could account for this though.

In general, it seems like using std::move could increase memory usage with vectors, because if you std::move into a vector, the new vector has the same capacity as the existing vector, but when you copy into a vector, the library could choose a smaller capacity just matching the size, and overall memory usage would be lower after the first vector is destroyed.

sipa · 2024-05-22T20:08:03Z

It's unclear to me whether the UniValue type has a move constructor, actually. If not, then these std::moves have no effect.

@ryanofsky If you theory is correct, then having a way to invoke shrink_to_fit on UniValue's internal vector might be useful.

theuni · 2024-05-22T20:16:37Z

It's unclear to me whether the UniValue type has a move constructor, actually. If not, then these std::moves have no effect.

@sipa I worked up a quick bench to test exactly that: theuni@35d5ffc

The numbers look as you'd expect.

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        7,273,304.00 |              137.49 |    0.1% |      0.08 | `UniValueAssignCopy`
|        1,040,156.00 |              961.39 |    0.5% |      0.02 | `UniValueAssignMove`
|        9,336,776.00 |              107.10 |    0.4% |      0.10 | `UniValueConstructCopy`
|          991,626.00 |            1,008.44 |    0.7% |      0.02 | `UniValueConstructMove`
|       11,783,927.00 |               84.86 |    0.2% |      0.13 | `UniValuePushKVCopy`
|        2,344,899.00 |              426.46 |    0.3% |      0.03 | `UniValuePushKVMove`

maflcko · 2024-05-23T10:08:15Z

It's unclear to me whether the UniValue type has a move constructor, actually. If not, then these std::moves have no effect.

Since this question keeps coming up (#24542 (comment)), what about adding static_assert(std::is_move_constructible_v<UniValue>); somewhere?

maflcko · 2024-05-23T11:02:58Z

@ryanofsky If you theory is correct, then having a way to invoke shrink_to_fit on UniValue's internal vector might be useful.

Not sure if this is ideal. UniValue is a recursive structure, so calling it on the top level vector only shouldn't cause any difference? Similarly, if shirking is done recursively, the runtime overhead will be equal to that of a copy, so might as well just do a copy instead? Finally, whenever a UniValue would be ready to shrink, it is usually one step away from being deleted completely anyway. The only missing step is to call write() on it to convert it to bytes to send on the wire. However that leads to an alternative, where a new write_and_destroy function is added, which recursively shrinks (deletes) the UniValue as soon as it was written to the stream. From a runtime-overhead this should be free (as the univalue is destroyed afterward anyway). I haven't tested or implemented this, but this may be a start:

diff

diff --git a/src/common/args.cpp b/src/common/args.cpp
index c90eb0c685..e66f18277d 100644
--- a/src/common/args.cpp
+++ b/src/common/args.cpp
@@ -248,7 +248,7 @@ bool ArgsManager::ParseParameters(int argc, const char* const argv[], std::strin
         const common::SettingsSpan values{*includes};
         // Range may be empty if -noincludeconf was passed
         if (!values.empty()) {
-            error = "-includeconf cannot be used from commandline; -includeconf=" + values.begin()->write();
+            error = "-includeconf cannot be used from commandline; -includeconf=" /*+ values.begin()->write_const()*/;
             return false; // pick first value as example
         }
     }
@@ -811,7 +811,7 @@ void ArgsManager::logArgsPrefix(
         for (const auto& value : arg.second) {
             std::optional<unsigned int> flags = GetArgFlags('-' + arg.first);
             if (flags) {
-                std::string value_str = (*flags & SENSITIVE) ? "****" : value.write();
+                std::string value_str = (*flags & SENSITIVE) ? "****" : ""/*value.write_const()*/;
                 LogPrintf("%s %s%s=%s\n", prefix, section_str, arg.first, value_str);
             }
         }
@@ -825,7 +825,7 @@ void ArgsManager::LogArgs() const
         logArgsPrefix("Config file arg:", section.first, section.second);
     }
     for (const auto& setting : m_settings.rw_settings) {
-        LogPrintf("Setting file arg: %s = %s\n", setting.first, setting.second.write());
+        LogPrintf("Setting file arg: %s = %s\n", setting.first, ""/*setting.second.write_const()*/);
     }
     logArgsPrefix("Command-line arg:", "", m_settings.command_line_options);
 }
diff --git a/src/common/settings.cpp b/src/common/settings.cpp
index c1520dacd2..268ab39f1e 100644
--- a/src/common/settings.cpp
+++ b/src/common/settings.cpp
@@ -99,7 +99,7 @@ bool ReadSettings(const fs::path& path, std::map<std::string, SettingsValue>& va
     file.close(); // Done with file descriptor. Release while copying data.
 
     if (!in.isObject()) {
-        errors.emplace_back(strprintf("Found non-object value %s in settings file %s", in.write(), fs::PathToString(path)));
+        errors.emplace_back(strprintf("Found non-object value %s in settings file %s", in.write_and_destroy(), fs::PathToString(path)));
         return false;
     }
 
@@ -138,7 +138,7 @@ bool WriteSettings(const fs::path& path,
         errors.emplace_back(strprintf("Error: Unable to open settings file %s for writing", fs::PathToString(path)));
         return false;
     }
-    file << out.write(/* prettyIndent= */ 4, /* indentLevel= */ 1) << std::endl;
+    file << out.write_and_destroy(/* prettyIndent= */ 4, /* indentLevel= */ 1) << std::endl;
     file.close();
     return true;
 }
diff --git a/src/httprpc.cpp b/src/httprpc.cpp
index 3eb34dbe6a..91b1574afc 100644
--- a/src/httprpc.cpp
+++ b/src/httprpc.cpp
@@ -87,7 +87,7 @@ static void JSONErrorReply(HTTPRequest* req, UniValue objError, const JSONRPCReq
     else if (code == RPC_METHOD_NOT_FOUND)
         nStatus = HTTP_NOT_FOUND;
 
-    std::string strReply = JSONRPCReplyObj(NullUniValue, std::move(objError), jreq.id, jreq.m_json_version).write() + "\n";
+    std::string strReply = JSONRPCReplyObj(NullUniValue, std::move(objError), jreq.id, jreq.m_json_version).write_and_destroy() + "\n";
 
     req->WriteHeader("Content-Type", "application/json");
     req->WriteReply(nStatus, strReply);
@@ -273,7 +273,7 @@ static bool HTTPReq_JSONRPC(const std::any& context, HTTPRequest* req)
             throw JSONRPCError(RPC_PARSE_ERROR, "Top-level object parse error");
 
         req->WriteHeader("Content-Type", "application/json");
-        req->WriteReply(HTTP_OK, reply.write() + "\n");
+        req->WriteReply(HTTP_OK, reply.write_and_destroy() + "\n");
     } catch (UniValue& e) {
         JSONErrorReply(req, std::move(e), jreq);
         return false;
diff --git a/src/rest.cpp b/src/rest.cpp
index 9fc5d4af04..f83133e187 100644
--- a/src/rest.cpp
+++ b/src/rest.cpp
@@ -269,7 +269,7 @@ static bool rest_headers(const std::any& context,
         for (const CBlockIndex *pindex : headers) {
             jsonHeaders.push_back(blockheaderToJSON(*tip, *pindex));
         }
-        std::string strJSON = jsonHeaders.write() + "\n";
+        std::string strJSON = jsonHeaders.write_and_destroy() + "\n";
         req->WriteHeader("Content-Type", "application/json");
         req->WriteReply(HTTP_OK, strJSON);
         return true;
@@ -338,7 +338,7 @@ static bool rest_block(const std::any& context,
         DataStream block_stream{block_data};
         block_stream >> TX_WITH_WITNESS(block);
         UniValue objBlock = blockToJSON(chainman.m_blockman, block, *tip, *pblockindex, tx_verbosity);
-        std::string strJSON = objBlock.write() + "\n";
+        std::string strJSON = objBlock.write_and_destroy() + "\n";
         req->WriteHeader("Content-Type", "application/json");
         req->WriteReply(HTTP_OK, strJSON);
         return true;
@@ -472,7 +472,7 @@ static bool rest_filter_header(const std::any& context, HTTPRequest* req, const
             jsonHeaders.push_back(header.GetHex());
         }
 
-        std::string strJSON = jsonHeaders.write() + "\n";
+        std::string strJSON = jsonHeaders.write_and_destroy() + "\n";
         req->WriteHeader("Content-Type", "application/json");
         req->WriteReply(HTTP_OK, strJSON);
         return true;
@@ -564,7 +564,7 @@ static bool rest_block_filter(const std::any& context, HTTPRequest* req, const s
     case RESTResponseFormat::JSON: {
         UniValue ret(UniValue::VOBJ);
         ret.pushKV("filter", HexStr(filter.GetEncodedFilter()));
-        std::string strJSON = ret.write() + "\n";
+        std::string strJSON = ret.write_and_destroy() + "\n";
         req->WriteHeader("Content-Type", "application/json");
         req->WriteReply(HTTP_OK, strJSON);
         return true;
@@ -591,7 +591,7 @@ static bool rest_chaininfo(const std::any& context, HTTPRequest* req, const std:
         jsonRequest.context = context;
         jsonRequest.params = UniValue(UniValue::VARR);
         UniValue chainInfoObject = getblockchaininfo().HandleRequest(jsonRequest);
-        std::string strJSON = chainInfoObject.write() + "\n";
+        std::string strJSON = chainInfoObject.write_and_destroy() + "\n";
         req->WriteHeader("Content-Type", "application/json");
         req->WriteReply(HTTP_OK, strJSON);
         return true;
@@ -634,7 +634,7 @@ static bool rest_deploymentinfo(const std::any& context, HTTPRequest* req, const
         }
 
         req->WriteHeader("Content-Type", "application/json");
-        req->WriteReply(HTTP_OK, getdeploymentinfo().HandleRequest(jsonRequest).write() + "\n");
+        req->WriteReply(HTTP_OK, getdeploymentinfo().HandleRequest(jsonRequest).write_and_destroy() + "\n");
         return true;
     }
     default: {
@@ -685,9 +685,9 @@ static bool rest_mempool(const std::any& context, HTTPRequest* req, const std::s
             if (verbose && mempool_sequence) {
                 return RESTERR(req, HTTP_BAD_REQUEST, "Verbose results cannot contain mempool sequence values. (hint: set \"verbose=false\")");
             }
-            str_json = MempoolToJSON(*mempool, verbose, mempool_sequence).write() + "\n";
+            str_json = MempoolToJSON(*mempool, verbose, mempool_sequence).write_and_destroy() + "\n";
         } else {
-            str_json = MempoolInfoToJSON(*mempool).write() + "\n";
+            str_json = MempoolInfoToJSON(*mempool).write_and_destroy() + "\n";
         }
 
         req->WriteHeader("Content-Type", "application/json");
@@ -747,7 +747,7 @@ static bool rest_tx(const std::any& context, HTTPRequest* req, const std::string
     case RESTResponseFormat::JSON: {
         UniValue objTx(UniValue::VOBJ);
         TxToUniv(*tx, /*block_hash=*/hashBlock, /*entry=*/ objTx);
-        std::string strJSON = objTx.write() + "\n";
+        std::string strJSON = objTx.write_and_destroy() + "\n";
         req->WriteHeader("Content-Type", "application/json");
         req->WriteReply(HTTP_OK, strJSON);
         return true;
@@ -940,7 +940,7 @@ static bool rest_getutxos(const std::any& context, HTTPRequest* req, const std::
         objGetUTXOResponse.pushKV("utxos", utxos);
 
         // return json string
-        std::string strJSON = objGetUTXOResponse.write() + "\n";
+        std::string strJSON = objGetUTXOResponse.write_and_destroy() + "\n";
         req->WriteHeader("Content-Type", "application/json");
         req->WriteReply(HTTP_OK, strJSON);
         return true;
@@ -992,7 +992,7 @@ static bool rest_blockhash_by_height(const std::any& context, HTTPRequest* req,
         req->WriteHeader("Content-Type", "application/json");
         UniValue resp = UniValue(UniValue::VOBJ);
         resp.pushKV("blockhash", pblockindex->GetBlockHash().GetHex());
-        req->WriteReply(HTTP_OK, resp.write() + "\n");
+        req->WriteReply(HTTP_OK, resp.write_and_destroy() + "\n");
         return true;
     }
     default: {
diff --git a/src/rpc/util.cpp b/src/rpc/util.cpp
index f5a2e9eb63..4c1d64eb35 100644
--- a/src/rpc/util.cpp
+++ b/src/rpc/util.cpp
@@ -166,7 +166,7 @@ std::string HelpExampleCliNamed(const std::string& methodname, const RPCArgList&
     for (const auto& argpair: args) {
         const auto& value = argpair.second.isStr()
                 ? argpair.second.get_str()
-                : argpair.second.write();
+                : ""/*argpair.second.write()*/;
         result += " " + argpair.first + "=" + ShellQuoteIfNeeded(value);
     }
     result += "\n";
@@ -187,7 +187,7 @@ std::string HelpExampleRpcNamed(const std::string& methodname, const RPCArgList&
     }
 
     return "> curl --user myusername --data-binary '{\"jsonrpc\": \"2.0\", \"id\": \"curltest\", "
-           "\"method\": \"" + methodname + "\", \"params\": " + params.write() + "}' -H 'content-type: text/plain;' http://127.0.0.1:8332/\n";
+           "\"method\": \"" + methodname + "\", \"params\": " + params.write_and_destroy() + "}' -H 'content-type: text/plain;' http://127.0.0.1:8332/\n";
 }
 
 // Converts a hex string to a public key if possible
@@ -631,7 +631,7 @@ UniValue RPCHelpMan::HandleRequest(const JSONRPCRequest& request) const
         }
     }
     if (!arg_mismatch.empty()) {
-        throw JSONRPCError(RPC_TYPE_ERROR, strprintf("Wrong type passed:\n%s", arg_mismatch.write(4)));
+        throw JSONRPCError(RPC_TYPE_ERROR, strprintf("Wrong type passed:\n%s", arg_mismatch.write_and_destroy(4)));
     }
     CHECK_NONFATAL(m_req == nullptr);
     m_req = &request;
@@ -650,8 +650,8 @@ UniValue RPCHelpMan::HandleRequest(const JSONRPCRequest& request) const
         if (!mismatch.isNull()) {
             std::string explain{
                 mismatch.empty() ? "no possible results defined" :
-                mismatch.size() == 1 ? mismatch[0].write(4) :
-                mismatch.write(4)};
+                mismatch.size() == 1 ? ""/* mismatch[0].write_and_destroy(4) */:
+                mismatch.write_and_destroy(4)};
             throw std::runtime_error{
                 strprintf("Internal bug detected: RPC call \"%s\" returned incorrect type:\n%s\n%s %s\nPlease report this issue here: %s\n",
                           m_name, explain,
@@ -950,7 +950,7 @@ std::string RPCArg::ToDescriptionString(bool is_named_arg) const
     if (m_fallback.index() == 1) {
         ret += ", optional, default=" + std::get<RPCArg::DefaultHint>(m_fallback);
     } else if (m_fallback.index() == 2) {
-        ret += ", optional, default=" + std::get<RPCArg::Default>(m_fallback).write();
+        ret += ", optional, default=" ;//+ std::get<RPCArg::Default>(m_fallback).write();
     } else {
         switch (std::get<RPCArg::Optional>(m_fallback)) {
         case RPCArg::Optional::OMITTED: {
diff --git a/src/univalue/include/univalue.h b/src/univalue/include/univalue.h
index da12157555..cc741eb84f 100644
--- a/src/univalue/include/univalue.h
+++ b/src/univalue/include/univalue.h
@@ -73,7 +73,9 @@ public:
     void getObjMap(std::map<std::string,UniValue>& kv) const;
     bool checkObject(const std::map<std::string,UniValue::VType>& memberTypes) const;
     const UniValue& operator[](const std::string& key) const;
+    UniValue& operator[](const std::string& key);
     const UniValue& operator[](size_t index) const;
+    UniValue& operator[](size_t index);
     bool exists(const std::string& key) const { size_t i; return findKey(key, i); }
 
     bool isNull() const { return (typ == VNULL); }
@@ -94,8 +96,7 @@ public:
     void pushKV(std::string key, UniValue val);
     void pushKVs(UniValue obj);
 
-    std::string write(unsigned int prettyIndent = 0,
-                      unsigned int indentLevel = 0) const;
+    std::string write_and_destroy(unsigned int prettyIndent = 0,                       unsigned int indentLevel = 0) ;
 
     bool read(std::string_view raw);
 
@@ -107,8 +108,8 @@ private:
 
     void checkType(const VType& expected) const;
     bool findKey(const std::string& key, size_t& retIdx) const;
-    void writeArray(unsigned int prettyIndent, unsigned int indentLevel, std::string& s) const;
-    void writeObject(unsigned int prettyIndent, unsigned int indentLevel, std::string& s) const;
+    void writeArray(unsigned int prettyIndent, unsigned int indentLevel, std::string& s);
+    void writeObject(unsigned int prettyIndent, unsigned int indentLevel, std::string& s);
 
 public:
     // Strict type-specific getters, these throw std::runtime_error if the
diff --git a/src/univalue/lib/univalue.cpp b/src/univalue/lib/univalue.cpp
index 656d2e8203..8ceab1fc31 100644
--- a/src/univalue/lib/univalue.cpp
+++ b/src/univalue/lib/univalue.cpp
@@ -197,6 +197,13 @@ const UniValue& UniValue::operator[](const std::string& key) const
     return values.at(index);
 }
 
+UniValue& UniValue::operator[](const std::string& key)
+{
+    size_t index(-1);
+    if (!findKey(key, index)) index=-1;
+    return values.at(index);
+}
+
 const UniValue& UniValue::operator[](size_t index) const
 {
     if (typ != VOBJ && typ != VARR)
@@ -207,6 +214,11 @@ const UniValue& UniValue::operator[](size_t index) const
     return values.at(index);
 }
 
+UniValue& UniValue::operator[](size_t index)
+{
+    return values.at(index);
+}
+
 void UniValue::checkType(const VType& expected) const
 {
     if (typ != expected) {
diff --git a/src/univalue/lib/univalue_write.cpp b/src/univalue/lib/univalue_write.cpp
index 4a2219061a..df10467fb6 100644
--- a/src/univalue/lib/univalue_write.cpp
+++ b/src/univalue/lib/univalue_write.cpp
@@ -28,8 +28,8 @@ static std::string json_escape(const std::string& inS)
 }
 
 // NOLINTNEXTLINE(misc-no-recursion)
-std::string UniValue::write(unsigned int prettyIndent,
-                            unsigned int indentLevel) const
+std::string UniValue::write_and_destroy(unsigned int prettyIndent,
+                            unsigned int indentLevel) 
 {
     std::string s;
     s.reserve(1024);
@@ -59,6 +59,7 @@ std::string UniValue::write(unsigned int prettyIndent,
         break;
     }
 
+*this = {}; // delete
     return s;
 }
 
@@ -68,7 +69,7 @@ static void indentStr(unsigned int prettyIndent, unsigned int indentLevel, std::
 }
 
 // NOLINTNEXTLINE(misc-no-recursion)
-void UniValue::writeArray(unsigned int prettyIndent, unsigned int indentLevel, std::string& s) const
+void UniValue::writeArray(unsigned int prettyIndent, unsigned int indentLevel, std::string& s)
 {
     s += "[";
     if (prettyIndent)
@@ -77,7 +78,7 @@ void UniValue::writeArray(unsigned int prettyIndent, unsigned int indentLevel, s
     for (unsigned int i = 0; i < values.size(); i++) {
         if (prettyIndent)
             indentStr(prettyIndent, indentLevel, s);
-        s += values[i].write(prettyIndent, indentLevel + 1);
+        s += values[i].write_and_destroy(prettyIndent, indentLevel + 1);
         if (i != (values.size() - 1)) {
             s += ",";
         }
@@ -91,7 +92,7 @@ void UniValue::writeArray(unsigned int prettyIndent, unsigned int indentLevel, s
 }
 
 // NOLINTNEXTLINE(misc-no-recursion)
-void UniValue::writeObject(unsigned int prettyIndent, unsigned int indentLevel, std::string& s) const
+void UniValue::writeObject(unsigned int prettyIndent, unsigned int indentLevel, std::string& s)
 {
     s += "{";
     if (prettyIndent)
@@ -103,7 +104,7 @@ void UniValue::writeObject(unsigned int prettyIndent, unsigned int indentLevel,
         s += "\"" + json_escape(keys[i]) + "\":";
         if (prettyIndent)
             s += " ";
-        s += values.at(i).write(prettyIndent, indentLevel + 1);
+        s += values.at(i).write_and_destroy(prettyIndent, indentLevel + 1);
         if (i != (values.size() - 1))
             s += ",";
         if (prettyIndent)

willcl-ark

ACK d7707d9

Whilst I still measure slightly increased resource usage vs master, even after adding some strategic calls to shrink_to_fit() on the Univalue.values member (which likely didn't work for the reasons @maflcko described above), I am now convinced this is not something to worry about in practice.

I've done more measurements over the last few days, and even when testing codepaths which create very large numbers of UniValues, I only ever see a few MB more RAM being used at peak. In the order of single digit percentage points. And, this excess does not appear to grow over time.

Therefore in my opinion this effect, where it's measurable/noticable at all, is more than offset by the reduced number of allocations (e.g. I've seen 106,000,000 allocations being reduced to 60,000,000 in larger tests using e.g. blockToJSON), and the speedups shown using @theuni's benchmark.

ryanofsky · 2024-05-23T14:58:18Z

Went ahead and merged this. It seems the increased peak memory usage is not a major concern, and other metrics like number of allocations and run time do seem improved.

Marco seems right that naively calling shrink_to_fit might not help much, or could be basically equivalent to copying, and there might be better ways to decrease memory usage, such as with a write_and_destroy method.

(Note: I also edited PR description to remove @ from @willcl-ark, so Will doesn't receive github spam when the merge commit is pulled into other repositories.)

theuni · 2024-05-23T15:03:38Z

@ryanofsky Thanks.

I'm going to continue working on the more complicated copies in follow-up PRs. I'll play around with your Copy/CopyFrom idea while I'm at it. If nothing else, that would be a good way to help me track down the remaining cases.

DrahtBot added the RPC/REST/ZMQ label May 15, 2024

ryanofsky approved these changes May 15, 2024

View reviewed changes

DrahtBot added the Needs rebase label May 16, 2024

rpc: avoid copying into UniValue

d7707d9

These are simple (and hopefully obviously correct) copies that can be moves instead.

theuni force-pushed the easy-univalue-moves branch from 35b5b0f to d7707d9 Compare May 20, 2024 16:53

DrahtBot removed the Needs rebase label May 20, 2024

DrahtBot requested review from laanwj, stickies-v, willcl-ark, ryanofsky, hebasto and TheCharlatan May 22, 2024 17:57

ryanofsky approved these changes May 22, 2024

View reviewed changes

willcl-ark approved these changes May 23, 2024

View reviewed changes

DrahtBot mentioned this pull request May 23, 2024

Encapsulate warnings in generalized node::Warnings and remove globals #30058

Open

ryanofsky merged commit 6300438 into bitcoin:master May 23, 2024
16 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

rpc: avoid copying into UniValue #30115

rpc: avoid copying into UniValue #30115

theuni commented May 15, 2024 •

edited by ryanofsky

DrahtBot commented May 15, 2024 •

edited

ryanofsky left a comment

theuni commented May 16, 2024 •

edited

TheCharlatan commented May 16, 2024

theuni commented May 16, 2024 •

edited

hebasto commented May 16, 2024

ryanofsky commented May 16, 2024

laanwj commented May 16, 2024 •

edited

stickies-v commented May 17, 2024

willcl-ark commented May 17, 2024

theuni commented May 17, 2024

maflcko commented May 20, 2024

theuni commented May 20, 2024

theuni commented May 22, 2024

achow101 commented May 22, 2024

ryanofsky left a comment

sipa commented May 22, 2024 •

edited

theuni commented May 22, 2024

maflcko commented May 23, 2024

maflcko commented May 23, 2024

willcl-ark left a comment

ryanofsky commented May 23, 2024

theuni commented May 23, 2024

rpc: avoid copying into UniValue #30115

rpc: avoid copying into UniValue #30115

Conversation

theuni commented May 15, 2024 • edited by ryanofsky

DrahtBot commented May 15, 2024 • edited

Code Coverage

Reviews

Conflicts

ryanofsky left a comment

Choose a reason for hiding this comment

theuni commented May 16, 2024 • edited

TheCharlatan commented May 16, 2024

theuni commented May 16, 2024 • edited

hebasto commented May 16, 2024

ryanofsky commented May 16, 2024

laanwj commented May 16, 2024 • edited

stickies-v commented May 17, 2024

willcl-ark commented May 17, 2024

theuni commented May 17, 2024

maflcko commented May 20, 2024

theuni commented May 20, 2024

theuni commented May 22, 2024

achow101 commented May 22, 2024

ryanofsky left a comment

Choose a reason for hiding this comment

sipa commented May 22, 2024 • edited

theuni commented May 22, 2024

maflcko commented May 23, 2024

maflcko commented May 23, 2024

willcl-ark left a comment

Choose a reason for hiding this comment

ryanofsky commented May 23, 2024

theuni commented May 23, 2024

theuni commented May 15, 2024 •

edited by ryanofsky

DrahtBot commented May 15, 2024 •

edited

theuni commented May 16, 2024 •

edited

theuni commented May 16, 2024 •

edited

laanwj commented May 16, 2024 •

edited

sipa commented May 22, 2024 •

edited