
Joe Carlsmith Audio (Joe Carlsmith)
Explore all episodes of Joe Carlsmith Audio
Date | Title | Duration | Description
---|---|---|---
05 Oct 2022 | Thoughts on being mortal | 00:12:44 | You can't keep any of it. The only thing to do is to give it away on purpose.
05 Oct 2022 | On clinging | 00:17:48 | How can "non-attachment" be compatible with care? We need to distinguish between caring and clinging.
05 Oct 2022 | Killing the ants | 00:15:08 | If you kill something, look it in the eyes as you do.
05 Oct 2022 | Can you control the past? | 01:17:03 | Sometimes, you can “control” events you have no causal interaction with (for example, if you're a deterministic software twin).
05 Oct 2022 | On future people, looking back at 21st century longtermism | 00:25:28 | I find imagining future people looking back on present-day longtermism (the view that positively influencing the long-term future should be a key moral priority) a helpful intuition pump, especially re: a certain kind of “holy sh**” reaction to existential risk, and to the possible size and quality of the future at stake.
05 Oct 2022 | Against neutrality about creating happy lives | 00:23:19 | Making happy people is good. Just ask the golden rule.
05 Oct 2022 | Actually possible: thoughts on Utopia | 00:28:39 | Life in the future could be profoundly good. I think this is an extremely important fact, and one that often goes underestimated.
05 Oct 2022 | On infinite ethics | 01:25:05 | Infinities puncture the dream of a simple, bullet-biting utilitarianism. But they're everyone's problem.
09 Oct 2022 | Against the normative realist's wager | 00:42:48 | If you find a button that gives you a hundred dollars if a certain controversial meta-ethical view is true, but you and your family get burned alive if that view is false, should you press the button? No.
01 Dec 2022 | Against meta-ethical hedonism | 01:02:29 | Can the epistemology of consciousness save moral realism and redeem experience machines? No.
23 Dec 2022 | On sincerity | 01:35:02 | Nearby is the country they call life. Text version at: https://joecarlsmith.com/2022/12/23/on-sincerity
25 Jan 2023 | Is Power-Seeking AI an Existential Risk? | 03:21:02 | Audio version of my report on existential risk from power-seeking AI. Text here: https://arxiv.org/pdf/2206.13353.pdf. Narration by Type III audio.
16 Feb 2023 | Why should ethical anti-realists do ethics? | 00:53:29 | Who needs a system if you're free? Text version at https://joecarlsmith.com/2023/02/16/why-should-ethical-anti-realists-do-ethics
17 Feb 2023 | Seeing more whole | 00:52:26 | On looking out of your own eyes. Text version at joecarlsmith.com.
05 Mar 2023 | Problems of evil | 00:35:42 | Is everything holy? Can reality, in itself, be worthy of reverence? Text version here: https://joecarlsmith.com/2021/04/19/problems-of-evil
19 Mar 2023 | Existential Risk from Power-Seeking AI (shorter version) | 00:55:03 | A shorter version of my report on existential risk from power-seeking AI. Forthcoming in an essay collection from Oxford University Press. Text version here: https://jc.gatspress.com/pdf/existential_risk_and_powerseeking_ai.pdf
08 May 2023 | Predictable updating about AI risk | 01:03:14 | How worried about AI risk will we feel in the future, when we can see advanced machine intelligence up close? We should worry accordingly now. Text version here: https://joecarlsmith.com/2023/05/08/predictable-updating-about-ai-risk
12 May 2023 | On the limits of idealized values | 01:00:14 | Contra some meta-ethical views, you can't forever aim to approximate the self you would become in idealized conditions. You have to actively create yourself, often in the here and now.
15 Oct 2023 | In memory of Louise Glück | 00:21:22 | "It was, she said, a great discovery, albeit my real life."
14 Nov 2023 | Introduction and summary of "Scheming AIs: Will AIs fake alignment during training in order to get power?" | 00:56:32 | This is a recording of the introductory section of my report "Scheming AIs: Will AIs fake alignment during training in order to get power?". This section includes a summary of the full report. The summary covers most of the main points and technical terminology, and I'm hoping that it will provide much of the context necessary to understand individual sections of the report on their own. (Note: the text of the report itself may not be public by the time this episode goes live.)
15 Nov 2023 | Full audio for "Scheming AIs: Will AIs fake alignment during training in order to get power?" | 06:13:17 | This is the full audio for my report "Scheming AIs: Will AIs fake alignment during training in order to get power?"
16 Nov 2023 | Varieties of fake alignment (Section 1.1 of "Scheming AIs") | 00:17:54 | This is section 1.1 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | A taxonomy of non-schemer models (Section 1.2 of "Scheming AIs") | 00:11:20 | This is section 1.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | Why focus on schemers in particular? (Sections 1.3-1.4 of "Scheming AIs") | 00:31:17 | This is sections 1.3-1.4 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | On "slack" in training (Section 1.5 of "Scheming AIs") | 00:07:12 | This is section 1.5 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | Situational awareness (Section 2.1 of "Scheming AIs") | 00:09:27 | This is section 2.1 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | Two concepts of an "episode" (Section 2.2.1 of "Scheming AIs") | 00:12:08 | This is section 2.2.1 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | Two sources of beyond-episode goals (Section 2.2.2 of "Scheming AIs") | 00:21:25 | This is section 2.2.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | "Clean" vs. "messy" goal-directedness (Section 2.2.3 of "Scheming AIs") | 00:16:44 | This is section 2.2.3 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | Is scheming more likely if you train models to have long-term goals? (Sections 2.2.4.1-2.2.4.2 of "Scheming AIs") | 00:09:01 | This is sections 2.2.4.1-2.2.4.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | How useful for alignment-relevant work are AIs with short-term goals? (Section 2.2.4.3 of "Scheming AIs") | 00:09:21 | This is section 2.2.4.3 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | The goal-guarding hypothesis (Section 2.3.1.1 of "Scheming AIs") | 00:19:11 | This is section 2.3.1.1 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | Does scheming lead to adequate future empowerment? (Section 2.3.1.2 of "Scheming AIs") | 00:22:54 | This is section 2.3.1.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | Non-classic stories about scheming (Section 2.3.2 of "Scheming AIs") | 00:24:34 | This is section 2.3.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | Arguments for/against scheming that focus on the path SGD takes (Section 3 of "Scheming AIs") | 00:29:03 | This is section 3 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | The counting argument for scheming (Sections 4.1 and 4.2 of "Scheming AIs") | 00:10:40 | This is sections 4.1 and 4.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | Simplicity arguments for scheming (Section 4.3 of "Scheming AIs") | 00:19:37 | This is section 4.3 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | Speed arguments against scheming (Sections 4.4-4.7 of "Scheming AIs") | 00:15:19 | This is sections 4.4 through 4.7 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | Summing up "Scheming AIs" (Section 5) | 00:15:46 | This is section 5 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
16 Nov 2023 | Empirical work that might shed light on scheming (Section 6 of "Scheming AIs") | 00:28:00 | This is section 6 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?” Text of the report here: https://arxiv.org/abs/2311.08379
27 Dec 2023 | In search of benevolence (or: what should you get Clippy for Christmas?) | 00:52:52 | What is altruism towards a paperclipper? Can you paint with all the colors of the wind at once?
02 Jan 2024 | Gentleness and the artificial Other | 00:22:39 | AIs as fellow creatures. And on getting eaten.
04 Jan 2024 | Deep atheism and AI risk | 00:46:59 | On a certain kind of fundamental mistrust towards Nature.
08 Jan 2024 | When "yang" goes wrong | 00:21:32 | On the connection between deep atheism and seeking control.
09 Jan 2024 | Does AI risk "other" the AIs? | 00:13:15 | Examining Robin Hanson's critique of the AI risk discourse.
11 Jan 2024 | An even deeper atheism | 00:25:12 | Who isn't a paperclipper?
16 Jan 2024 | Being nicer than Clippy | 00:47:30 | Let's be the sort of species that aliens wouldn't fear the way we fear paperclippers.
18 Jan 2024 | On the abolition of man | 01:09:22 | What does it take to avoid tyranny towards the future?
21 Mar 2024 | On green | 01:15:13 | Examining a philosophical vibe that I think contrasts in interesting ways with "deep atheism."
25 Mar 2024 | On attunement | 00:44:14 | Examining a certain kind of meaning-laden receptivity to the world.
17 Jun 2024 | Loving a world you don't trust | 01:03:54 | Garden, campfire, healing water.
17 Jun 2024 | First half of full audio for "Otherness and control in the age of AGI" | 03:07:29 | First half of the full audio for my series on how agents with different values should relate to one another, and on the ethics of seeking and sharing power.
18 Jun 2024 | Second half of full audio for "Otherness and control in the age of AGI" | 04:11:02 | Second half of the full audio for my series on how agents with different values should relate to one another, and on the ethics of seeking and sharing power.
21 Jun 2024 | Introduction and summary for "Otherness and control in the age of AGI" | 00:12:23 | This is the introduction and summary for my series "Otherness and control in the age of AGI."
30 Sep 2024 | (Part 1, Otherness) Extended audio from my conversation with Dwarkesh Patel | 03:58:38 | Extended audio from my conversation with Dwarkesh Patel. This part focuses on my series "Otherness and control in the age of AGI." Transcript available on my website here: https://joecarlsmith.com/2024/09/30/part-1-otherness-extended-audio-transcript-from-my-conversation-with-dwarkesh-patel/
30 Sep 2024 | (Part 2, AI takeover) Extended audio from my conversation with Dwarkesh Patel | 02:07:33 | Extended audio from my conversation with Dwarkesh Patel. This part focuses on the basic story about AI takeover. Transcript available on my website here: https://joecarlsmith.com/2024/09/30/part-2-ai-takeover-extended-audio-transcript-from-my-conversation-with-dwarkesh-patel
18 Dec 2024 | Takes on "Alignment Faking in Large Language Models" | 01:27:54 | What can we learn from recent empirical demonstrations of scheming in frontier models? Text version here: https://joecarlsmith.com/2024/12/18/takes-on-alignment-faking-in-large-language-models/
28 Jan 2025 | Fake thinking and real thinking | 01:18:47 | When the line pulls at your hand.
13 Feb 2025 | How do we solve the alignment problem? | 00:08:43 | Introduction to a series of essays about paths to safe and useful superintelligence. Text version here: https://joecarlsmith.substack.com/p/how-do-we-solve-the-alignment-problem
13 Feb 2025 | What is it to solve the alignment problem? | 00:40:13 | Also: to avoid it? Handle it? Solve it forever? Solve it completely? Text version here: https://joecarlsmith.substack.com/p/what-is-it-to-solve-the-alignment
19 Feb 2025 | When should we worry about AI power-seeking? | 00:46:54 | Examining the conditions required for rogue AI behavior. Text version here: https://joecarlsmith.substack.com/p/when-should-we-worry-about-ai-power
11 Mar 2025 | Paths and waystations in AI safety | 00:18:07 | On the structure of the path to safe superintelligence, and some possible milestones along the way. Text version here: https://joecarlsmith.substack.com/p/paths-and-waystations-in-ai-safety
14 Mar 2025 | AI for AI safety | 00:27:51 | We should try extremely hard to use AI labor to help address the alignment problem. Text version here: https://joecarlsmith.com/2025/03/14/ai-for-ai-safety