Grok 3 AI

Topic Author
JBTX
Posts: 12257
Joined: Wed Jul 26, 2017 12:46 pm

Grok 3 AI

Post by JBTX »

Have any of you tried Grok 3 for financial or other questions like estate planning? I put in some pretty detailed estate planning questions related to our situation (inherited IRAs, special needs trusts) and it gave a very detailed and accurate (to the best of my knowledge) answer, and said some things I’ve learned only from some of the best estate planning lawyers here and elsewhere.

You can feed it a lot of detailed info and it does a good job of processing it.

It can also give pretty specific medical diagnosis and other information - which is out of scope here so please don’t discuss.

Has anybody tried it yet for financial or estate planning type questions? This is the first iteration of AI I have found scarily useful. You could probably copy and paste many OP post questions on here and the answer will likely be very good.
Harmanic
Posts: 2521
Joined: Mon Apr 04, 2022 10:19 am

Re: Grok 3 AI

Post by Harmanic »

There are all kinds of AI models out there. For important tasks, it's best to try several to compare the answers. I would not rely on one for anything more than a simple question.
“He who fears he shall suffer, already suffers what he fears.” - Michel de Montaigne
Topic Author
JBTX
Posts: 12257
Joined: Wed Jul 26, 2017 12:46 pm

Re: Grok 3 AI

Post by JBTX »

Harmanic wrote: Tue Mar 11, 2025 6:30 pm There are all kinds of AI models out there. For important tasks, it's best to try several to compare the answers. I would not rely on one for anything more than a simple question.
I haven’t used them much but heard some raves about grok and tried it - from the little bit I’ve tried with others this was a step above.

I wouldn’t rely on one AI any more than one doctor or one lawyer, but for the few things I put in, the feedback was that good.
nisiprius
Advisory Board
Posts: 55170
Joined: Thu Jul 26, 2007 9:33 am
Location: The terrestrial, globular, planetary hunk of matter, flattened at the poles, is my abode.--O. Henry

Re: Grok 3 AI

Post by nisiprius »

Columbia Journalism Review report: AI Search Has A Citation Problem
We Compared Eight AI Search Engines. They’re All Bad at Citing News....
  • Chatbots were generally bad at declining to answer questions they couldn’t answer accurately, offering incorrect or speculative answers instead.
  • Premium chatbots provided more confidently incorrect answers than their free counterparts.
  • Generative search tools fabricated links and cited syndicated and copied versions of articles.
Grok 3 often came out near the bottom. Note that red and pink squares are bad, green squares are good.

Image
Grok-3 Search and Perplexity Pro's responses exhibited less uncertainty and were more frequently confidently incorrect than their free counterparts in our study.
ChatGPT, Perplexity, and Perplexity Pro exhibited unexpected behaviors given what we know about which publishers allow them crawler access. On some occasions, the chatbots either incorrectly answered or declined to answer queries from publishers that permitted them to access their content. On the other hand, they sometimes correctly answered queries about publishers whose content they shouldn’t have had access to.
The generative search tools we tested had a common tendency to cite the wrong article. For instance, DeepSeek misattributed the source of the excerpts provided in our queries 115 out of 200 times. This means that news publishers’ content was most often being credited to the wrong source....

More than half of responses from Gemini and Grok 3 cited fabricated or broken URLs that led to error pages.
Annual income twenty pounds, annual expenditure nineteen nineteen and six, result happiness; Annual income twenty pounds, annual expenditure twenty pounds ought and six, result misery.
Topic Author
JBTX
Posts: 12257
Joined: Wed Jul 26, 2017 12:46 pm

Re: Grok 3 AI

Post by JBTX »

This is a situation I previously started a thread on.

viewtopic.php?t=450015

Last month I collected my dad's tax forms and gave them to their CPA. I also entered them in parallel into TurboTax. There was a discrepancy which I determined to be an error on their part by researching and talking to some other sources. I informed the CPA and they wouldn't change it - they said "we verified our inputs are correct". In other words they are just software input monkeys and can't think for themselves. So I just filed myself on TurboTax.

Anyway, I described that situation to Grok, in terms of what was on their 1099R from their retirement plan, which boxes were filled, what values etc and asked if the CPA made an error, and it came back and said they did, and gave all of the right reasons, and even speculated as to how the CPA may have hosed it up.

What was really interesting was the answer was right, but grok mislabeled a line on a specific tax form. After it answered, I pointed out the error to grok, and it confirmed the error. I asked it why it made the error, and it gave a very detailed but legible and humanized response - and said it has learned from that error and won't make it again.

I am literally floored by this thing. It is another level.
Last edited by JBTX on Tue Mar 11, 2025 8:08 pm, edited 1 time in total.
Topic Author
JBTX
Posts: 12257
Joined: Wed Jul 26, 2017 12:46 pm

Re: Grok 3 AI

Post by JBTX »

nisiprius wrote: Tue Mar 11, 2025 6:58 pm Columbia Journalism Review report: AI Search Has A Citation Problem
We Compared Eight AI Search Engines. They’re All Bad at Citing News....
  • Chatbots were generally bad at declining to answer questions they couldn’t answer accurately, offering incorrect or speculative answers instead.
  • Premium chatbots provided more confidently incorrect answers than their free counterparts.
  • Generative search tools fabricated links and cited syndicated and copied versions of articles.
Grok 3 often came out near the bottom. Note that red and pink squares are bad, green squares are good.

Image
Grok-3 Search and Perplexity Pro's responses exhibited less uncertainty and were more frequently confidently incorrect than their free counterparts in our study.
ChatGPT, Perplexity, and Perplexity Pro exhibited unexpected behaviors given what we know about which publishers allow them crawler access. On some occasions, the chatbots either incorrectly answered or declined to answer queries from publishers that permitted them to access their content. On the other hand, they sometimes correctly answered queries about publishers whose content they shouldn’t have had access to.
The generative search tools we tested had a common tendency to cite the wrong article. For instance, DeepSeek misattributed the source of the excerpts provided in our queries 115 out of 200 times. This means that news publishers’ content was most often being credited to the wrong source....

More than half of responses from Gemini and Grok 3 cited fabricated or broken URLs that led to error pages.
I have noticed Grok definitely has a more confident tone, vs the few times I used others in the past. And in the post above I explained an error it made, even though it got the tax question I asked right.

Obviously it isn’t perfect but I am amazed by the accuracy and specificity with which it has answered the half dozen questions I have asked it. I’d be interested to see whether, if you gave it all the inputs to your tax forms, it could create a 1040. I wouldn’t be shocked if it could.
HootingSloth
Posts: 1282
Joined: Mon Jan 28, 2019 2:38 pm

Re: Grok 3 AI

Post by HootingSloth »

As a tax lawyer, I have experimented with several of these large language model AI tools to see what kinds of tax-related questions they can answer. All other ones I have tried were worse than useless, but in noodling around with Grok 3 I was fairly impressed. Not good enough to rely on, but not worthless either.
Last edited by HootingSloth on Tue Mar 11, 2025 8:12 pm, edited 1 time in total.
Building TIPS ladder for all residual needs and some wants after SS, pension, and paid-off house. Other wants from 5% constant percentage from Risk Portfolio (80/20 AA w/ 80% global + 20% US-tilt)
BiggerSaver
Posts: 78
Joined: Sat Jan 25, 2025 8:15 pm

Re: Grok 3 AI

Post by BiggerSaver »

https://www.oneusefulthing.org/p/a-new- ... -claude-37

The author (Ethan Mollick) might be the biggest non technical AI expert out there.
Paul Grouch
Posts: 13
Joined: Thu Feb 25, 2021 7:54 pm

Re: Grok 3 AI

Post by Paul Grouch »

JBTX wrote: Tue Mar 11, 2025 8:01 pm This is a situation I previously started a thread on.

viewtopic.php?t=450015

Last month I collected my dad's tax forms and gave them to their CPA. I also entered them in parallel into TurboTax. There was a discrepancy which I determined to be an error on their part by researching and talking to some other sources. I informed the CPA and they wouldn't change it - they said "we verified our inputs are correct". In other words they are just software input monkeys and can't think for themselves. So I just filed myself on TurboTax.

Anyway, I described that situation to Grok, in terms of what was on their 1099R from their retirement plan, which boxes were filled, what values etc and asked if the CPA made an error, and it came back and said they did, and gave all of the right reasons, and even speculated as to how the CPA may have hosed it up.

What was really interesting was the answer was right, but grok mislabeled a line on a specific tax form. After it answered, I pointed out the error to grok, and it confirmed the error. I asked it why it made the error, and it gave a very detailed but legible and humanized response - and said it has learned from that error and won't make it again.

I am literally floored by this thing. It is another level.
For what it's worth, the bot has almost certainly not learned anything and could make the same mistake again. They're quite difficult to add new information to after they've completed their training. These things have just had a lot of fine tuning to say things people want to hear.
mjg
Posts: 450
Joined: Fri Oct 14, 2022 4:06 pm
Location: CA

Re: Grok 3 AI

Post by mjg »

It is indeed both fascinating and scary how it can take long and detailed information with various questions and provide responses that show the ability to parse all the input and provide detailed answers that are useful and overall mostly accurate.

Also, as a retired engineer it’s kind of jaw dropping to be given a description / specification for a programming task and it spits out code for the task you want done.
Pacific
Posts: 1643
Joined: Tue Mar 06, 2007 7:19 pm
Location: Lost in the middle of the Pacific

Re: Grok 3 AI

Post by Pacific »

JBTX wrote: Tue Mar 11, 2025 6:28 pm
You can feed it a lot of detailed info and it does a good job of processing it.
In the article posted by nisiprius (also found here at TechSpot: https://www.techspot.com/news/107101-new-study-finds-ai-search-tools-60-percent.html), it stated:
ChatGPT isn't even the worst of the bunch. Both versions of X's Grok AI performed poorly, with Grok-3 Search being 94 percent inaccurate. Microsoft's Copilot was not that much better when you consider that it declined to answer 104 queries out of 200. Of the remaining 96, only 16 were "completely correct," 14 were "partially correct," and 66 were "completely incorrect," making it roughly 70 percent inaccurate.
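The Copilot figures in that excerpt can be checked with a little arithmetic. The sketch below is only an illustration: the variable names are made up here, and the counts are taken straight from the quoted text. It suggests the "roughly 70 percent" figure is the share of completely incorrect answers among the 96 queries Copilot actually answered.

```python
# Counts from the TechSpot excerpt quoted above; variable names are illustrative.
total_queries = 200
declined = 104
completely_correct = 16
partially_correct = 14
completely_incorrect = 66

answered = total_queries - declined  # queries Copilot actually attempted

# Sanity check: the three outcome counts should cover every answered query.
assert completely_correct + partially_correct + completely_incorrect == answered


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Three different ways to turn the same counts into an "inaccuracy" rate.
incorrect_of_answered = percent(completely_incorrect, answered)
not_fully_correct_of_answered = percent(
    partially_correct + completely_incorrect, answered
)
incorrect_of_all = percent(completely_incorrect, total_queries)

print(f"Completely incorrect, of answered queries: {incorrect_of_answered:.2f}%")
print(f"Not fully correct, of answered queries:    {not_fully_correct_of_answered:.2f}%")
print(f"Completely incorrect, of all queries:      {incorrect_of_all:.2f}%")
```

Counting the partially correct answers as misses pushes the share higher, and measuring against all 200 queries pulls it lower, so the headline number depends on which denominator is used.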
bluebolt
Posts: 2242
Joined: Sat Jan 14, 2017 8:01 am

Re: Grok 3 AI

Post by bluebolt »

mjg wrote: Tue Mar 11, 2025 11:54 pm It is indeed both fascinating and scary how it can take long and detailed information with various questions and provide responses that show the ability to parse all the input and provide detailed answers that are useful and overall mostly accurate.

Also, as a retired engineer it’s kind of jaw dropping to be given a description / specification for a programming task and it spits out code for the task you want done.
And the converse, too. Give it a mess of spaghetti code some unknown engineer wrote years ago, and it will describe what it does, write documentation for it, and give you suggestions for improvement. Tasks that could have taken a senior engineer a significant amount of time for complex code.
ObliviousInvestor
Posts: 4336
Joined: Tue Mar 17, 2009 9:32 am

Re: Grok 3 AI

Post by ObliviousInvestor »

In my experience so far, the best results I have gotten from AI about tax law, Social Security rules, or other complicated financial planning topics are what I would describe as "mostly correct." Sometimes maybe even 95% correct.

That can be very helpful if, for instance, you have encountered a new topic somewhere and need to get a quick understandable introduction to the concepts.

But it's not good enough if you need something that you can actually rely on to make a critical decision. That's especially because the trickiest thing about AI output is that it's very difficult to determine what part might be incorrect, unless you already have deep expertise in the subject matter. The thing that AI gets so mind-blowingly correct is tone (which is why it does such a great job with things like "write me a rap duet, about baking a chocolate silk pie, to be performed by Martha Stewart and Snoop Dogg"). The incorrect bits in a technical answer will be extremely well disguised to look like correct information.
Mike Piper | Roth is a name, not an acronym. If you type ROTH, you're just yelling about retirement accounts.
Coltrane75
Posts: 379
Joined: Wed Feb 06, 2019 1:32 pm

Re: Grok 3 AI

Post by Coltrane75 »

nisiprius wrote: Tue Mar 11, 2025 6:58 pm Columbia Journalism Review report: AI Search Has A Citation Problem
We Compared Eight AI Search Engines. They’re All Bad at Citing News....
  • Chatbots were generally bad at declining to answer questions they couldn’t answer accurately, offering incorrect or speculative answers instead.
  • Premium chatbots provided more confidently incorrect answers than their free counterparts.
  • Generative search tools fabricated links and cited syndicated and copied versions of articles.
Grok 3 often came out near the bottom. Note that red and pink squares are bad, green squares are good.

Image
Grok-3 Search and Perplexity Pro's responses exhibited less uncertainty and were more frequently confidently incorrect than their free counterparts in our study.
ChatGPT, Perplexity, and Perplexity Pro exhibited unexpected behaviors given what we know about which publishers allow them crawler access. On some occasions, the chatbots either incorrectly answered or declined to answer queries from publishers that permitted them to access their content. On the other hand, they sometimes correctly answered queries about publishers whose content they shouldn’t have had access to.
The generative search tools we tested had a common tendency to cite the wrong article. For instance, DeepSeek misattributed the source of the excerpts provided in our queries 115 out of 200 times. This means that news publishers’ content was most often being credited to the wrong source....

More than half of responses from Gemini and Grok 3 cited fabricated or broken URLs that led to error pages.
More like Gronk 3
Epsilon Delta
Posts: 8391
Joined: Thu Apr 28, 2011 7:00 pm

Re: Grok 3 AI

Post by Epsilon Delta »

Paul Grouch wrote: Tue Mar 11, 2025 9:15 pm
For what it's worth, the bot has almost certainly not learned anything and could make the same mistake again. They're quite difficult to add new information to after they've completed their training. These things have just had a lot of fine tuning to say things people want to hear.
So, they are indistinguishable from many humans.
MrNarwhal
Posts: 353
Joined: Sun Dec 19, 2021 8:20 pm

Re: Grok 3 AI

Post by MrNarwhal »

Tried my usual test:
"Give FEN notation for a valid chess position after the third move of the white pieces in the French defense, exchange variation opening."

As usual, the narrative move order was fine but the FEN was wrong. It first moved the white knight from g1 to f3. Grok 3 actually caught this mistake itself (I wonder how) but even with prompting failed to restore the knight to g1. I tried again and it avoided this mistake, but returned an invalid FEN including 5p3 for the fifth rank - which would require 9 files.

I like the step-by-step output formatting but still not sure what I'd do with this.
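The "9 files" problem is easy to check mechanically. Here is a minimal sketch, assuming only the standard FEN convention that a digit stands for that many empty squares and a letter for one piece. The function names are hypothetical, and the sample position is my own worked example of 1.e4 e6 2.d4 d5 3.exd5, not Grok's output. It checks only the piece-placement field, not whether the position is legal or reachable.

```python
PIECES = set("prnbqkPRNBQK")


def rank_width(rank: str) -> int:
    """Count how many files one rank field of a FEN string describes."""
    width = 0
    for ch in rank:
        if ch in "12345678":
            width += int(ch)  # a digit means that many empty squares
        elif ch in PIECES:
            width += 1  # a letter is one occupied square
        else:
            raise ValueError(f"unexpected character {ch!r} in rank {rank!r}")
    return width


def placement_problems(fen: str) -> list[str]:
    """Return problems found in the piece-placement field only."""
    ranks = fen.split()[0].split("/")
    problems = []
    if len(ranks) != 8:
        problems.append(f"expected 8 ranks, found {len(ranks)}")
    # FEN lists ranks from the 8th down to the 1st.
    for rank_number, rank in zip(range(8, 0, -1), ranks):
        width = rank_width(rank)
        if width != 8:
            problems.append(f"rank {rank_number} ({rank}) describes {width} files")
    return problems


# Worked example (an assumption, not from the thread): after 1.e4 e6 2.d4 d5 3.exd5.
example = "rnbqkbnr/ppp2ppp/4p3/3P4/3P4/8/PPP2PPP/RNBQKBNR b KQkq - 0 3"
print(placement_problems(example))
print(rank_width("5p3"))  # the rank text reported above
```

A check like this catches malformed ranks such as 5p3, but it would not have caught the first mistake (the knight on f3), since that position is well-formed and merely wrong.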
jebmke
Posts: 30772
Joined: Thu Apr 05, 2007 2:44 pm
Location: Delmarva Peninsula

Re: Grok 3 AI

Post by jebmke »

HootingSloth wrote: Tue Mar 11, 2025 8:08 pm As a tax lawyer, I have experimented with several of these large language model AI tools to see what kinds of tax-related questions they can answer. All other ones I have tried were worse than useless, but in noodling around with Grok 3 I was fairly impressed. Not good enough to rely on, but not worthless either.
I'll have to take a look. Late last year I gave Gemini the test on Maryland taxes that we use for volunteers. It failed miserably.
Don't trust me, look it up. https://www.irs.gov/forms-instructions-and-publications
hunoraut
Posts: 2377
Joined: Sun May 31, 2020 11:39 am

Re: Grok 3 AI

Post by hunoraut »

JBTX wrote: Tue Mar 11, 2025 6:28 pm Have any of you tried Grok 3 for financial or other questions like estate planning? I put in some pretty detailed estate planning questions related to our situation (inherited IRAs, special needs trusts) and it gave a very detailed and accurate (to the best of my knowledge) answer, and said some things I’ve learned only from some of the best estate planning lawyers here and elsewhere.

You can feed it a lot of detailed info and it does a good job of processing it.

It can also give pretty specific medical diagnosis and other information - which is out of scope here so please don’t discuss.

Has anybody tried it yet for financial or estate planning type questions? This is the first iteration of AI I have found scarily useful. You could probably copy and paste many OP post questions on here and the answer will likely be very good.
I've used it a bit for medical diagnoses and financials and am very impressed by it.
It converged on answers I would eventually have reached myself, but much faster.

Financial examples:
*Tax policy and laws in foreign languages. I eventually ask for primary sources to verify the results myself. Overall far faster than using general web search in a language I'm not fluent in.
*Performing moderately complex chained financial calculations. I can and have done these myself, but it takes time to build the Excel models.

All of the answers are accompanied by step-by-step calculations, robust explanations, or references. There is a 'DeepSearch' option that shows even more of the logic.

On the whole I prefer it over other models I've tried. Keep in mind each has different strengths, except perhaps MS CoPilot, which seems to be terrible at everything.
YoungSisyphus
Posts: 391
Joined: Mon Sep 24, 2018 7:35 am

Re: Grok 3 AI

Post by YoungSisyphus »

I love AIs; despite their limitations, they are incredible time savers.

Have not used Grok 3 but I use:

1. Bluej for U.S. tax questions.
2. ChatGPT with deep research (limited # of uses) - had a complex question yesterday and it used 50 sources to answer my question which gave me a great place to start.
ETK517
Posts: 679
Joined: Sun Aug 13, 2023 7:19 am

Re: Grok 3 AI

Post by ETK517 »

JBTX wrote: Tue Mar 11, 2025 6:28 pm (to the best of my knowledge )
Pretty big limitation on the utility of the tool.
nisiprius
Advisory Board
Posts: 55170
Joined: Thu Jul 26, 2007 9:33 am
Location: The terrestrial, globular, planetary hunk of matter, flattened at the poles, is my abode.--O. Henry

Re: Grok 3 AI

Post by nisiprius »

BiggerSaver wrote: Tue Mar 11, 2025 8:10 pm https://www.oneusefulthing.org/p/a-new- ... -claude-37

The author (Ethan Mollick) might be the biggest non technical AI expert out there.
And he has the mark of the enthusiast: he goes into detail about all the good things, while glossing over all the bad things with one aside:
To be clear, these systems are far from perfect and make mistakes.
But that's the big question with AI. It is important to know what kinds of mistakes these are, how consequential they are, how much it costs to check the results to find the mistakes, and how much it costs to act on the results without checking them.

Above all, in the real world, are real organizations using AI quality assurance teams to check the results? Or are they acting directly on unverified AI output in the belief that AI never makes important mistakes?
Annual income twenty pounds, annual expenditure nineteen nineteen and six, result happiness; Annual income twenty pounds, annual expenditure twenty pounds ought and six, result misery.
nisiprius
Advisory Board
Posts: 55170
Joined: Thu Jul 26, 2007 9:33 am
Location: The terrestrial, globular, planetary hunk of matter, flattened at the poles, is my abode.--O. Henry

Re: Grok 3 AI

Post by nisiprius »

A recent question that I used with other AIs because I actually wanted the answer. The answer looks terrific. I've boldfaced two details.
I wrote:Please make a table of the 11 Vanguard sector ETFs, together with the asset values (AUM) in each one.
Grok 3 wrote:Below is a table listing the 11 Vanguard sector ETFs along with their Assets Under Management (AUM) as of the most recent available data. Vanguard offers a series of sector-specific ETFs that track various industries within the U.S. economy, aligning with the 11 sectors defined by the Global Industry Classification Standard (GICS). Since exact AUM figures can fluctuate daily and my data is fresh but not real-time as of March 12, 2025, I’ve sourced the most current approximate values available up to early 2025 from reliable financial sources and Vanguard’s official updates. Note that these figures are subject to change, and for precise, up-to-the-minute values, you’d need to consult Vanguard’s website or a financial data platform directly.

Sector ETF | Ticker | Sector | AUM (in billions USD)
Vanguard Communication Services ETF | VOX | Communication Services | 4.2
Vanguard Consumer Discretionary ETF | VCR | Consumer Discretionary | 0.9
Vanguard Consumer Staples ETF | VDC | Consumer Staples | 6.8
Vanguard Energy ETF | VDE | Energy | 8.5
Vanguard Financials ETF | VFH | Financials | 3.1
Vanguard Health Care ETF | VHT | Health Care | 19.6
Vanguard Industrials ETF | VIS | Industrials | 5.1
Vanguard Information Technology ETF | VGT | Information Technology | 71.8
Vanguard Materials ETF | VAW | Materials | 4.0
Vanguard Real Estate ETF | VNQ | Real Estate | 34.9
Vanguard Utilities ETF | VPU | Utilities | 6.2
Notes:

AUM Values: These figures are approximate and based on data available as of early 2025 (e.g., January or February 2025 reports from sources like Vanguard, ETFdb, or Morningstar). The exact AUM can vary due to market performance, inflows/outflows, and share creations/redemptions. For instance, VGT’s AUM of $71.8 billion reflects its dominance in the tech sector as of early 2025, while VCR’s smaller $0.9 billion highlights its niche appeal.
Source Context: The values are derived from aggregated data up to November 30, 2024, in some cases (e.g., Vanguard’s corporate updates), adjusted with trends noted in early 2025 reports. No improvisation has been made beyond reasonable extrapolation from available data.
Sector Coverage: These ETFs correspond to the 11 GICS sectors, each tracking a specific MSCI U.S. Investable Market Index (IMI) for that sector, covering approximately 150-300 stocks per fund depending on the sector’s breadth.
Looks terrific!

Until you check it.

Image
Annual income twenty pounds, annual expenditure nineteen nineteen and six, result happiness; Annual income twenty pounds, annual expenditure twenty pounds ought and six, result misery.
mjg
Posts: 450
Joined: Fri Oct 14, 2022 4:06 pm
Location: CA

Re: Grok 3 AI

Post by mjg »

I reran this just now:

Image
BiggerSaver
Posts: 78
Joined: Sat Jan 25, 2025 8:15 pm

Re: Grok 3 AI

Post by BiggerSaver »

nisiprius wrote: Wed Mar 12, 2025 8:51 am
BiggerSaver wrote: Tue Mar 11, 2025 8:10 pm https://www.oneusefulthing.org/p/a-new- ... -claude-37

The author (Ethan Mollick) might be the biggest non technical AI expert out there.
And he has the mark of the enthusiast: he goes into detail about all the good things, while glossing over all the bad things with one aside:
To be clear, these systems are far from perfect and make mistakes.
But that's the big question with AI. It is important to know what kinds of mistakes these are, how consequential they are, how much it costs to check the results to find the mistakes, and how much it costs to act on the results without checking them.

Above all, in the real world, are real organizations using AI quality assurance teams to check the results? Or are they acting directly on unverified AI output in the belief that AI never makes important mistakes?
Aligned.

As for Ethan Mollick, he has done more to show the gaps than anyone I know of.
Topic Author
JBTX
Posts: 12257
Joined: Wed Jul 26, 2017 12:46 pm

Re: Grok 3 AI

Post by JBTX »

Another poster and I are working on a potential estate planning topics wiki. Just us two non lawyers so far. I took sections of the contents, dropped them into Grok and asked it to evaluate for accuracy. It did so. Luckily most of our content was good, but it did find one error and made several subtle but relevant suggestions.

For instance there was a sentence in there saying estate tax rates are high and are 40% past the first million over the unified credit. Grok flagged that it could be interpreted as meaning the taxes start after the first million, instead of the actual case of graduated rates on the first million. That isn’t what the OP meant, but it could be a reasonable literal interpretation.

In discussing the unified credit, it mentioned that we didn’t say that the TCJA expires at the end of 2025 and the amounts revert to lower numbers.

There was a typo in a passage discussing the ILIT (irrevocable life insurance trust): the OP had presumably mistakenly written about putting life insurance in a revocable trust. Grok asked if we meant an irrevocable trust.

Lots of people want to poke holes in it - it does make mistakes and no I wouldn’t rely on it exclusively, but given the high level of accuracy on even advanced topics it can be very useful.

I’ve mostly ignored AI to date but it is now at a high enough level that it is useful. Imagine how good it will likely be in another year or two.
stan1
Posts: 16692
Joined: Mon Oct 08, 2007 4:35 pm

Re: Grok 3 AI

Post by stan1 »

I see it coming on fast: fewer highly specialized professionals needed, and those that remain earn less. Those remaining learn how to use the tools to do human verification, make any needed adjustments, and communicate with clients. Already happening with AI generated software where people now need to learn to communicate with the AI tools just as much as learn a programming language.

Many attorneys already use will-writing software for routine estate plans. That may not cover all cases, and we seem to have plenty of more complicated estate planning situations here on Bogleheads, but it will soon be good enough, with human supervision, for many situations.
stan1
Posts: 16692
Joined: Mon Oct 08, 2007 4:35 pm

Re: Grok 3 AI

Post by stan1 »

JBTX wrote: Wed Mar 12, 2025 11:12 am Another poster and I are working on a potential estate planning topics wiki. Just us two non lawyers so far.
Sounds like a heroic task. One request: if there is to be a "how to ask estate planning questions" section please ask people to give a general dollar figure for the size of the estate. People are often humble about that on this forum, and it's difficult to know whether the person has $300K, $3M, or $30M. Estate planning is not one size fits all. Sometimes people think they have a very routine situation but follow-up posts reveal a lot of complexity!
Topic Author
JBTX
Posts: 12257
Joined: Wed Jul 26, 2017 12:46 pm

Re: Grok 3 AI

Post by JBTX »

Paul Grouch wrote: Tue Mar 11, 2025 9:15 pm
JBTX wrote: Tue Mar 11, 2025 8:01 pm This is a situation I previously started a thread on.

viewtopic.php?t=450015

Last month I collected my dad's tax forms and gave them to their CPA. I also entered them in parallel into TurboTax. There was a discrepancy which I determined to be an error on their part by researching and talking to some other sources. I informed the CPA and they wouldn't change it - they said "we verified our inputs are correct". In other words they are just software input monkeys and can't think for themselves. So I just filed myself on TurboTax.

Anyway, I described that situation to Grok, in terms of what was on their 1099R from their retirement plan, which boxes were filled, what values etc and asked if the CPA made an error, and it came back and said they did, and gave all of the right reasons, and even speculated as to how the CPA may have hosed it up.

What was really interesting was the answer was right, but grok mislabeled a line on a specific tax form. After it answered, I pointed out the error to grok, and it confirmed the error. I asked it why it made the error, and it gave a very detailed but legible and humanized response - and said it has learned from that error and won't make it again.

I am literally floored by this thing. It is another level.
For what it's worth, the bot has almost certainly not learned anything and could make the same mistake again. They're quite difficult to add new information to after they've completed their training. These things have just had a lot of fine tuning to say things people want to hear.
I put the exact same question in today and it did not make the same error.
hunoraut
Posts: 2377
Joined: Sun May 31, 2020 11:39 am

Re: Grok 3 AI

Post by hunoraut »

ETK517 wrote: Wed Mar 12, 2025 8:35 am
JBTX wrote: Tue Mar 11, 2025 6:28 pm (to the best of my knowledge )
Pretty big limitation on the utility of the tool.
What is the alternative? What is the assurance that a tax lawyer's response is 100% accurate or is the absolute best advice for the query?

The user still has a duty to judge the quality of responses (e.g. by asking probing questions); to not rely on a binary answer without any justification; to seek references or even a 2nd opinion.
Eric
Posts: 901
Joined: Fri Feb 23, 2007 4:44 pm
Location: Texas

Re: Grok 3 AI

Post by Eric »

JBTX wrote: Wed Mar 12, 2025 11:12 am For instance there was a sentence in there saying estate tax rates are high and are 40% past the first million over the unified credit. Grok flagged that it could be interpreted as meaning the taxes start after the first million, instead of the actual case of graduated rates on the first million.
But that's not quite right, either.

The "graduated rates" are purely theoretical at this point, because the unified credit fully offsets all of the tax that would be imposed at those rates. The listing of lower graduated rates in the Internal Revenue Code is a meaningless and misleading holdover from the days when those rates actually applied.

The true operation of the rules, today, is that there is no estate tax on the first $13.99 million in taxable assets, and then a flat 40% tax on the excess. No graduated rates, no "first million" (because that first million is already subsumed in the $13.99 million).
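As a rough illustration of that "exemption, then flat 40%" description, here is a minimal sketch. The function and constant names are hypothetical, the dollar figures are the ones stated above, and it assumes a single decedent with no deductions, prior taxable gifts, portability, or state-level taxes. It is a simplification for illustration, not a substitute for the actual computation.

```python
EXEMPTION = 13_990_000  # the $13.99 million figure stated above
FLAT_RATE_PERCENT = 40  # flat rate applied to the excess, as stated above


def simplified_estate_tax(taxable_estate: int) -> int:
    """Whole-dollar tax under the simplified 'exemption, then flat 40%' model."""
    # Nothing is taxed until the estate exceeds the exemption amount.
    excess = max(0, taxable_estate - EXEMPTION)
    # Integer arithmetic avoids floating-point rounding on dollar amounts.
    return excess * FLAT_RATE_PERCENT // 100


for estate in (5_000_000, 13_990_000, 15_000_000):
    print(f"${estate:,} estate -> ${simplified_estate_tax(estate):,} tax")
```

In this model a $15,000,000 estate is $1,010,000 over the exemption, so the tax is 40% of that excess, or $404,000. There is no separate bracket for a "first million" anywhere in the calculation.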
grok87
Posts: 11111
Joined: Tue Feb 27, 2007 8:00 pm

Re: Grok 3 AI

Post by grok87 »

YoungSisyphus wrote: Wed Mar 12, 2025 8:24 am I love AIs; despite their limitations, they are incredible time savers.

Have not used Grok 3 but I use:

1. Bluej for U.S. tax questions.
2. ChatGPT with deep research (limited # of uses) - had a complex question yesterday and it used 50 sources to answer my question which gave me a great place to start.
so this is the link for Bluej correct?
https://www.bluej.com/
can you please give an example of where Bluej has added value to your thought process?
cheers,
grok87
RIP Mr. Bogle.
ETK517
Posts: 679
Joined: Sun Aug 13, 2023 7:19 am

Re: Grok 3 AI

Post by ETK517 »

hunoraut wrote: Wed Mar 12, 2025 11:54 am
ETK517 wrote: Wed Mar 12, 2025 8:35 am

Pretty big limitation on the utility of the tool.
What is the alternative? What is the assurance that a tax lawyer's response is equally 100% accurate or is the absolute best advice for the query?

User still has a duty to judge the quality of responses (eg by asking probing questions); to not rely on binary answer without any justification; to seek references or even a 2nd opinion.
The difference is lawyers have a duty to give you correct advice. If a tax lawyer gives you incorrect advice, you have recourse against them. A tax lawyer who consistently gave incorrect advice would be out of business. For that reason, I typically do not second-guess the professionals I hire by requiring justifications or seeking references or a second opinion. That defeats the purpose of hiring a professional.
nisiprius
Advisory Board
Posts: 55170
Joined: Thu Jul 26, 2007 9:33 am
Location: The terrestrial, globular, planetary hunk of matter, flattened at the poles, is my abode.--O. Henry

Re: Grok 3 AI

Post by nisiprius »

mjg wrote: Wed Mar 12, 2025 10:39 am I reran this just now:
[image of a correct answer from Grok 3]
Just to be clear: you aren't suggesting that I fibbed, are you?

One of the problems with these AI tools is that they are not deterministic.
Annual income twenty pounds, annual expenditure nineteen nineteen and six, result happiness; Annual income twenty pounds, annual expenditure twenty pounds ought and six, result misery.
nisiprius
Advisory Board
Posts: 55170
Joined: Thu Jul 26, 2007 9:33 am
Location: The terrestrial, globular, planetary hunk of matter, flattened at the poles, is my abode.--O. Henry

Re: Grok 3 AI

Post by nisiprius »

hunoraut wrote: Wed Mar 12, 2025 11:54 am ...User still has a duty to judge the quality of responses (eg by asking probing questions); to not rely on binary answer without any justification; to seek references or even a 2nd opinion...
But a) are AI users, in fact, doing that? As I asked above, do corporations that use AI have an "AI quality assurance team" fact-checking the answers?

b) If they are, how much does that cost, and how much does it vitiate the supposed cost-savings of AI? (Which appear to be going up as AI companies become unable to absorb them... those server farms with incomprehensible numbers of the very latest chips must have monster depreciation...)
Annual income twenty pounds, annual expenditure nineteen nineteen and six, result happiness; Annual income twenty pounds, annual expenditure twenty pounds ought and six, result misery.
Topic Author
JBTX
Posts: 12257
Joined: Wed Jul 26, 2017 12:46 pm

Re: Grok 3 AI

Post by JBTX »

Eric wrote: Wed Mar 12, 2025 11:58 am
JBTX wrote: Wed Mar 12, 2025 11:12 am For instance there was a sentence in there saying estate tax rates are high and are 40% past the first million over the unified credit. Grok flagged that as potentially indicating that could be interpreted as the taxes start after the first million, instead of the actual case of graduated rates on the first million.
But that's not quite right, either.

The "graduated rates" are purely theoretical at this point, because the unified credit fully offsets all of the tax that would be imposed at those rates. The listing of lower graduated rates in the Internal Revenue Code is a meaningless and misleading holdover from the days when those rates actually applied.

The true operation of the rules, today, is that there is no estate tax on the first $13.99 million in taxable assets, and then a flat 40% tax on the excess. No graduated rates, no "first million" (because that first million is already subsumed in the $13.99 million).
Good to know. Thus the dangers of relying too much on AI.
Topic Author
JBTX
Posts: 12257
Joined: Wed Jul 26, 2017 12:46 pm

Re: Grok 3 AI

Post by JBTX »

nisiprius wrote: Wed Mar 12, 2025 1:29 pm
hunoraut wrote: Wed Mar 12, 2025 11:54 am ...User still has a duty to judge the quality of responses (eg by asking probing questions); to not rely on binary answer without any justification; to seek references or even a 2nd opinion...
But a) are AI users, in fact, doing that? As I asked above, do corporations that use AI have an "AI quality assurance team" fact-checking the answers?

b) If they are, how much does that cost, and how much does it vitiate the supposed cost-savings of AI? (Which appear to be going up as AI companies become unable to absorb them... those server farms with incomprehensible numbers of the very latest chips must have monster depreciation...)
We typically don’t (or shouldn’t) rely on unvetted and unchecked information from people either. You can go to a website, and it may have an error. Somebody posts a thread here, there are lots of responses, some not totally accurate.
arcticpineapplecorp.
Posts: 17254
Joined: Tue Mar 06, 2012 8:22 pm

Re: Grok 3 AI

Post by arcticpineapplecorp. »

JBTX wrote: Wed Mar 12, 2025 3:03 pm We typically don’t ( or shouldn’t rely) on unvetted and unchecked information from people either. You can go to a website, and it may have an error. Somebody posts a thread here, there are lots of responses, some not totally accurate.
so AI gets us the same imperfect human information...faster!
It's hard to accept the truth when the lies were exactly what you wanted to hear. Investing is simple, but not easy. Buy, hold & rebalance low cost index funds & manage taxable events. Asking Portfolio Questions | Wiki
hunoraut
Posts: 2377
Joined: Sun May 31, 2020 11:39 am

Re: Grok 3 AI

Post by hunoraut »

ETK517 wrote: Wed Mar 12, 2025 12:21 pm The difference is lawyers have a duty to give you correct advice. If a tax lawyer gives you incorrect advice, you have recourse against them. A tax lawyer who consistently gave incorrect advice would be out of business. For that reason, I typically do not second-guess the professionals I hire by requiring justifications or seeking references or a second opinion. That defeats the purpose of hiring a professional.
Pedantry has its places: the duty to act in your best interest with diligence and competence does not assure perfect, infallible advice! Recourse is a deterrent, not a guarantee.

There are practicing lawyers who are bad at their job. There are practicing (and obviously no-longer-practicing) lawyers who have been found liable for malpractice!

(If doctors and lawyers were perfect, they wouldn't need malpractice insurance.)

AI models aren't any different. They’re programmed to give you the best answers to their ability. (This does not mean perfect, infallible answers.) And poorly performing AI models would also go out of business. Same competitive incentive.

As a matter of empirical reality, people do seek second opinions from credentialed professionals, even if you do not. I've done it quite a bit with dentists.

nisiprius wrote: Wed Mar 12, 2025 1:29 pm But a) are AI users, in fact, doing that? As I asked above, do corporations that use AI have an "AI quality assurance team" fact-checking the answers?
The prerogative of the user is not the burden or fault of the AI.

A corporation electing to deploy AI model(s) has the imperative to test them for accuracy and effectiveness, the same way they scrutinize any other purchase. My employer uses 3, which they recommend for different purposes, all accompanied by obvious warnings about reliance on the answers.

To give a real example, I recently wanted to know the size of Treasuries maturing each year for the next 3 decades. I wouldn't trust a neighborhood CFA (“finance expert”) to know these numbers accurately, off-hand. (Is it available on a BBG terminal?) I had to sniff around the Treasury website for the data, which was in all sorts of formats, and I eventually compiled it into a maturity forecast.

I asked Grok the same question. It said it did not have access to a real-time maturity calendar, but here's its approach, here are all the key assumptions and caveats, and here are all the data sources it used, and together, here are the *estimated* sizes each year, along with a short note for each.

It bookended the answer with why it wasn't able to give precise numbers, and what/where I would need to look to find the amounts: the exact site and source I had landed on by myself.

If a user reads that, and (1) doesn't find it reasonable and robust, or (2) worse, just ignores the explanation and blindly trusts the number... that's entirely the fault of the user :)
mjg
Posts: 450
Joined: Fri Oct 14, 2022 4:06 pm
Location: CA

Re: Grok 3 AI

Post by mjg »

nisiprius wrote: Wed Mar 12, 2025 1:26 pm
mjg wrote: Wed Mar 12, 2025 10:39 am I reran this just now:
[image of a correct answer from Grok 3]
Just to be clear: you aren't suggesting that I fibbed, are you?

One of the problems with these AI tools is that they are not deterministic.
Certainly not suggesting you fibbed. I'm just saying Grok 3 pulls data from other sources (as it says in its reply), and the site it pulled data from could have been incorrect.
MindBogler
Posts: 1494
Joined: Wed Apr 17, 2013 12:05 pm

Re: Grok 3 AI

Post by MindBogler »

Grok 3 works quite well. Like all LLMs it will confidently hallucinate certain things until challenged. Usually they will learn from the correction within the current context. The DeepSearch feature is much more accurate than a general query, but it's mostly useful for examining medical or scientific literature.

I'll add some things I've done with it recently:

1. Examined my retire early strategy based on savings multiple and withdrawal rates

The model was almost entirely correct and mirrored 99% of what would be said on this forum.

2. Designed a 12'x20' shed with a BOM and assembly instructions

I wasn't looking to build a shed, more doing a sanity check of its outputs. 100% correct.

3. Planned a hypothetical European wedding / vacation

It spit out a very nice resort in Greece to stay with a complete itinerary and total estimated cost (~95% accurate). It asked questions to refine the answers and timeline.

4. Researched esoteric information about a specific classic car restoration project

This particular car is a project I've been planning and somewhat working on for over 20 years. I have a wealth of knowledge and notes collected over the decades. It knew things that I spent days or weeks researching and picking people's brains over online. I spent the better part of an afternoon going back and forth with the model over every major system and generating an order of operations to rebuild. This one was absolutely mind-blowing.

The most important thing for people to understand is that these models are as dumb today as they will ever be. The rate of progress here will be astonishing.
Topic Author
JBTX
Posts: 12257
Joined: Wed Jul 26, 2017 12:46 pm

Re: Grok 3 AI

Post by JBTX »

arcticpineapplecorp. wrote: Wed Mar 12, 2025 3:09 pm
JBTX wrote: Wed Mar 12, 2025 3:03 pm We typically don’t ( or shouldn’t rely) on unvetted and unchecked information from people either. You can go to a website, and it may have an error. Somebody posts a thread here, there are lots of responses, some not totally accurate.
so AI gets us the same imperfect human information...faster!
To a degree, yes. AI is going to be limited in some ways to human knowledge as it has been documented on the internet, and everything out there is not perfect. I suppose in time it learns (or is programmed) to check multiple sources based upon reliability. Also it will learn by sheer trial and error.

This version has become good enough that I will start going to it first, vs Google search. Obviously if any particular piece of information is critical I would verify it elsewhere.
Topic Author
JBTX
Posts: 12257
Joined: Wed Jul 26, 2017 12:46 pm

Re: Grok 3 AI

Post by JBTX »

Eric wrote: Wed Mar 12, 2025 11:58 am
JBTX wrote: Wed Mar 12, 2025 11:12 am For instance there was a sentence in there saying estate tax rates are high and are 40% past the first million over the unified credit. Grok flagged that as potentially indicating that could be interpreted as the taxes start after the first million, instead of the actual case of graduated rates on the first million.
But that's not quite right, either.

The "graduated rates" are purely theoretical at this point, because the unified credit fully offsets all of the tax that would be imposed at those rates. The listing of lower graduated rates in the Internal Revenue Code is a meaningless and misleading holdover from the days when those rates actually applied.

The true operation of the rules, today, is that there is no estate tax on the first $13.99 million in taxable assets, and then a flat 40% tax on the excess. No graduated rates, no "first million" (because that first million is already subsumed in the $13.99 million).
So is this article and example wrong?

https://smartasset.com/taxes/all-about-the-estate-tax
For example, let’s say your estate is valued at $14.43 million in 2025. That means your total taxable estate is $440,000, as it’s worth that much more than the $13.99 million threshold. At the appropriate tax tier, you’ll pay the base rate of $70,800, plus an additional $64,600 ($190,000 taxed at 34%). That comes out to a total estate tax of $135,400.
valleyrock
Posts: 1581
Joined: Sun Aug 12, 2018 7:12 am

Re: Grok 3 AI

Post by valleyrock »

In another thread here the topic was an AI application called NotebookLM. You give it as input the documents you want it to use. For example, PDFs of your appliances' and cars' owners' manuals. (Did I get the apostrophes right in the previous sentence?) Then ask it questions about fixing something or a maintenance schedule, etc. specific to your situation.

Why not do this with the Tax Code (Federal and the state in question)? Instead of trying to Grok (sorry, I can't help but get back to the Heinlein book where grokking was what Martians did, or what humans raised by Martians did) using everything that an AI has scraped up, just use the code. I suppose it would work even better with certain legal texts and case law in the mix. But the point remains that it might lead to more relevant results and fewer hallucinations if there's carefully curated control over the inputs.
mhadden1
Posts: 1573
Joined: Tue Mar 25, 2014 8:14 pm
Location: North Alabama

Re: Grok 3 AI

Post by mhadden1 »

nisiprius wrote: Tue Mar 11, 2025 6:58 pm
We Compared Eight AI Search Engines. They’re All Bad at Citing News....
  • Premium chatbots provided more confidently incorrect answers than their free counterparts.
Note to self: I could be a premium chatbot.
Retired 12/31/2015, age 58 years 77 days (but who's counting?)
Tabulator
Posts: 572
Joined: Sat Mar 31, 2012 4:03 pm

Re: Grok 3 AI

Post by Tabulator »

BiggerSaver wrote: Tue Mar 11, 2025 8:10 pm The author (Ethan Mollick) might be the biggest non technical AI expert out there.
Thanks for pointing him out. He made me reconsider whether I should be using Perplexity as much as I have lately.
Eric
Posts: 901
Joined: Fri Feb 23, 2007 4:44 pm
Location: Texas

Re: Grok 3 AI

Post by Eric »

JBTX wrote: Wed Mar 12, 2025 8:27 pm So is this article and example wrong?

https://smartasset.com/taxes/all-about-the-estate-tax
For example, let’s say your estate is valued at $14.43 million in 2025. That means your total taxable estate is $440,000, as it’s worth that much more than the $13.99 million threshold. At the appropriate tax tier, you’ll pay the base rate of $70,800, plus an additional $64,600 ($190,000 taxed at 34%). That comes out to a total estate tax of $135,400.
Yes, that article and example are wrong. Let's go to a primary source: The Internal Revenue Code.

Section 2001 of the Code imposes the tax, and subsection (c) sets out the rate schedule. The rates start at 18% for the first $10K, and increase until reaching 40% for amounts over $1 million.

Section 2010 establishes a unified credit that offsets the tax dollar-for-dollar. Subsection (c) sets the credit at whatever amount is necessary to offset the tax on an "exclusion amount."* The exclusion amount is currently $13.99 million.

So: To figure your tax the hard way, first you calculate (i) the tax payable on the entire estate using the section 2001 rate table, then you calculate (ii) the tax payable on $13.99 million using the section 2001 rate table, and then finally you subtract (ii) from (i) to determine the actual tax due. But if you think through the algebra of that, the effect is to tax the estate only on the amount exceeding $13.99 million, at a flat 40% rate (because the tax due in all the lower brackets is wiped out by the credit).

[*Strictly speaking there's an "applicable exclusion amount" and a "basic exclusion amount" but the difference relates only to porting unused unified credit between spouses so let's ignore that nuance for now.]
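
The "hard way" calculation described above can be sketched in a few lines of Python. Treat this as an illustration under simplifying assumptions, not a tax calculator: the bracket table is transcribed from the section 2001(c) schedule, the $13.99 million exclusion is the 2025 figure quoted in this thread, and the function names (tentative_tax, estate_tax, flat_shortcut) are made up for the example. It ignores deductions, prior taxable gifts, portability between spouses, and state-level taxes.

```python
# Section 2001(c) schedule as (bracket floor in dollars, marginal rate in percent).
BRACKETS = [
    (0, 18), (10_000, 20), (20_000, 22), (40_000, 24), (60_000, 26),
    (80_000, 28), (100_000, 30), (150_000, 32), (250_000, 34),
    (500_000, 37), (750_000, 39), (1_000_000, 40),
]

# Exclusion amount quoted in the thread for 2025 (an input assumption here).
EXCLUSION = 13_990_000


def tentative_tax(amount):
    """Tax on `amount` under the graduated table, in whole dollars."""
    total = 0  # running sum of (dollars in bracket) * (rate in percent)
    for i, (floor, pct) in enumerate(BRACKETS):
        if amount <= floor:
            break
        # The top bracket has no ceiling, so cap it at the amount itself.
        ceiling = BRACKETS[i + 1][0] if i + 1 < len(BRACKETS) else amount
        total += (min(amount, ceiling) - floor) * pct
    return total // 100


def estate_tax(taxable_estate):
    """Step (i) minus step (ii): table tax on the estate less the unified credit."""
    credit = tentative_tax(EXCLUSION)
    return max(0, tentative_tax(taxable_estate) - credit)


def flat_shortcut(taxable_estate):
    """Flat 40% on whatever exceeds the exclusion amount."""
    return max(0, taxable_estate - EXCLUSION) * 40 // 100


if __name__ == "__main__":
    estate = 14_430_000  # the estate value used in the SmartAsset example
    print(estate_tax(estate))     # 176000
    print(flat_shortcut(estate))  # 176000
```

Under these assumptions, both routes give $176,000 for the $14.43 million estate, which is 40% of the $440,000 excess, rather than the $135,400 in the quoted article. The article's figure comes from running the $440,000 through the low brackets a second time, even though the credit has already used them up.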
surfinagin
Posts: 582
Joined: Wed Oct 26, 2011 8:19 pm

Re: Grok 3 AI

Post by surfinagin »

Great thread, along with others regarding using AI-
Had no idea "Grok" existed.
Grok provided a detailed, well written answer to my specific 1099-R question that I'd spent quite a bit of time on down the Google Rabbit Hole.
ETK517
Posts: 679
Joined: Sun Aug 13, 2023 7:19 am

Re: Grok 3 AI

Post by ETK517 »

hunoraut wrote: Wed Mar 12, 2025 5:25 pm AI models arent any different. They’re programmed to give you the best answers to their ability. (This does not mean perfect, infallible answers). And poorly performing AI models would also go out of business. Same competitive incentive.
You ignore the key difference, which is the lack of recourse for wrong answers. If you file your taxes based on incorrect advice from Grok, you're the only one responsible for it. While obviously individuals are fallible and can and do give bad advice, what we know about LLMs is that they are guaranteed to be incorrect in a large percentage of cases based on fundamental aspects of their design. I would not hire a professional who I knew would be wrong often.

Because there is no recourse, there is also little incentive for AI models to get things right. Most people confidently and credulously rely on answers from AI and thus would never know the difference. This thread is a great example - AI got wrong the absolute most basic, simple estate tax question, one that even a terrible estates lawyer could answer, and people still believe it. Hilarious.
hunoraut
Posts: 2377
Joined: Sun May 31, 2020 11:39 am

Re: Grok 3 AI

Post by hunoraut »

ETK517 wrote: Thu Mar 13, 2025 7:46 am While obviously individuals are fallible and can and do give bad advice, what we know about LLMs is that they are guaranteed to be incorrect in a large percentage of cases based on fundamental aspects of their design. I would not hire a professional who I knew would be wrong often.

Because there is no recourse, there is also little incentive for AI models to get things right. Most people confidently and credulously rely on answers from AI and thus would never know the difference. This thread is a great example - AI got wrong the absolute most basic, simple estate tax question, one that even a terrible estates lawyer could answer, and people still believe it. Hilarious.
There is tremendous irony in slagging AI for basic inaccuracies while making a claim that doesn't stand up to simple reason. There's tremendous money poured into AI development and training, whose products compete daily in benchmarking and for market acceptance, yet there is no incentive for the product to be correct?
Most people confidently and credulously rely on answers from AI and would never know the difference?
1. First, whether and how people might use AI is not an indictment of AI. You driving a Ferrari on a street doesn't signal that it's incapable on track.
2. Second, no large scale survey, nor even the responses in this thread or forum, indicates people "confidently and credulously relying on answers from AI". With what authority are you making that claim?
3. Third, empirical evidence from a study of "thousands of human-AI conversations" on Claude shows its majority use is for collaborative *augmentation* tasks, rather than pure answer-seeking "automation". The largest group of users comes from computer and mathematical occupations -- those most familiar with the capabilities and limits of AI -- for coding tasks. The second is media and related occupations, for writing refinement, e.g. "can you help align this post with the Chicago style guide".
You ignore the key difference, which is the lack of recourse for wrong answers. If you file your taxes based on incorrect advice from Grok, you're the only one responsible for it.
Every year my taxes are prepared by different firms for different jurisdictions. I sign an agreement waiving liability, acknowledging their due care in the accuracy of the filing according to information I provide.

It is ultimately my responsibility.

In the case they make an obvious $$$$ sized mistake, perhaps I can seek recourse through the legal process. But in what world do you see a person fully and blindly handing responsibility for a tax issue with $$$$ consequences to an AI model instead?
I would not hire a professional who I knew would be wrong often.
This is conditional on you actually knowing how wrong each professional is in the first place. Do you know?
ETK517
Posts: 679
Joined: Sun Aug 13, 2023 7:19 am

Re: Grok 3 AI

Post by ETK517 »

hunoraut wrote: Thu Mar 13, 2025 8:32 am yet there is no incentive for the product to be correct?
These products are being widely used despite research showing they are wrong much of the time. As of now, there's no legal or regulatory requirement that AI be correct, nor any legal liability for being incorrect. Very, very few people who use AI are capable of assessing its accuracy themselves and those who are capable are largely unwilling to do so, since assessing accuracy requires doing the research you're trying to avoid by using an AI tool. In those circumstances, why would you invest a lot of money improving accuracy?

Maybe 20 years ago, people argued the internet would make high quality information widely available to more people and thereby make society as a whole more knowledgeable and rational. Instead, it has done the opposite. AI seems primed to do the same.
hunoraut wrote: Thu Mar 13, 2025 8:32 am This is conditional on you actually knowing how wrong each professional is in the first place. Do you know?
No, just like you don't know whether an AI tool is right or wrong. But I do know that a professional is legally and ethically responsible to me for their mistakes. With AI, you're guaranteed to be on your own when it inevitably provides false and inaccurate info. If you think that's worth getting the info for free, that's your prerogative.