<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://onnocenter.or.id/wiki/index.php?action=history&amp;feed=atom&amp;title=Dataset_Ideal</id>
	<title>Dataset Ideal - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://onnocenter.or.id/wiki/index.php?action=history&amp;feed=atom&amp;title=Dataset_Ideal"/>
	<link rel="alternate" type="text/html" href="https://onnocenter.or.id/wiki/index.php?title=Dataset_Ideal&amp;action=history"/>
	<updated>2026-05-03T19:24:55Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.35.4</generator>
	<entry>
		<id>https://onnocenter.or.id/wiki/index.php?title=Dataset_Ideal&amp;diff=72245&amp;oldid=prev</id>
		<title>Onnowpurbo: /* Practical Example: */</title>
		<link rel="alternate" type="text/html" href="https://onnocenter.or.id/wiki/index.php?title=Dataset_Ideal&amp;diff=72245&amp;oldid=prev"/>
		<updated>2025-04-01T01:00:13Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Practical Example:&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left diff-editfont-monospace&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 01:00, 1 April 2025&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l52&quot; &gt;Line 52:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 52:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* '''ML training (SVM/Random Forest)''': 3,000–10,000 comments is ideal&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* '''ML training (SVM/Random Forest)''': 3,000–10,000 comments is ideal&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* '''Deep learning (LSTM/BERT)''': 10,000+ comments will be far more stable and accurate&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* '''Deep learning (LSTM/BERT)''': 10,000+ comments will be far more stable and accurate&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;==Interesting Links==&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* [[Orange]]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Onnowpurbo</name></author>
	</entry>
	<entry>
		<id>https://onnocenter.or.id/wiki/index.php?title=Dataset_Ideal&amp;diff=72244&amp;oldid=prev</id>
		<title>Onnowpurbo: Created page with &quot;Good question! The '''ideal''' dataset size actually depends on:  =='''1. Type of Model Used'''==  The more complex the model, the more data is nee...&quot;</title>
		<link rel="alternate" type="text/html" href="https://onnocenter.or.id/wiki/index.php?title=Dataset_Ideal&amp;diff=72244&amp;oldid=prev"/>
		<updated>2025-04-01T00:59:51Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;Good question! The &amp;#039;&amp;#039;&amp;#039;ideal&amp;#039;&amp;#039;&amp;#039; dataset size actually depends on:  ==&amp;#039;&amp;#039;&amp;#039;1. Type of Model Used&amp;#039;&amp;#039;&amp;#039;==  The more complex the model, the more data is nee...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;Good question! The '''ideal''' dataset size actually depends on:&lt;br /&gt;
&lt;br /&gt;
=='''1. Type of Model Used'''==&lt;br /&gt;
&lt;br /&gt;
The more complex the model, the more data it needs.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Ideal Dataset Size&lt;br /&gt;
|-&lt;br /&gt;
! Model Type !! Ideal Dataset Size&lt;br /&gt;
|-&lt;br /&gt;
| '''Simple statistics''' || A few hundred samples are enough&lt;br /&gt;
|-&lt;br /&gt;
| '''Classical machine learning''' (Random Forest, SVM) || Thousands of samples work better&lt;br /&gt;
|-&lt;br /&gt;
| '''Deep learning''' (LSTM, CNN) || Tens of thousands to hundreds of thousands of samples&lt;br /&gt;
|-&lt;br /&gt;
| '''Transformer (BERT, IndoBERT)''' || Ideally hundreds of thousands to millions of samples&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=='''2. Problem Complexity'''==&lt;br /&gt;
If the task is simple (for example, positive/negative classification), less data can suffice.&lt;br /&gt;
&lt;br /&gt;
If the task is complex (multiple categories, imbalanced or noisy data), more data is needed.&lt;br /&gt;
&lt;br /&gt;
=='''3. Class Balance'''==&lt;br /&gt;
Ideally, the data is split evenly across categories. For example:&lt;br /&gt;
- Positive: 1,000&lt;br /&gt;
- Negative: 1,000&lt;br /&gt;
- Neutral: 1,000&lt;br /&gt;
&lt;br /&gt;
If the classes are imbalanced (for example, 90% positive and 10% negative), the model can become biased toward the majority class.&lt;br /&gt;
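&lt;br /&gt;
A quick way to check class balance before training is to count the labels. The sketch below is a minimal illustration in Python and is not part of the original page: the function name class_balance and the sample labels are assumptions, chosen to mirror the 90% / 10% split mentioned above.&lt;br /&gt;

```python
from collections import Counter


def class_balance(labels):
    """Return each class's share of the data and the majority/minority count ratio."""
    counts = Counter(labels)
    total = sum(counts.values())
    # Fraction of the dataset that belongs to each class
    shares = {label: count / total for label, count in counts.items()}
    # 1.0 means perfectly balanced; larger values mean stronger imbalance
    imbalance_ratio = max(counts.values()) / min(counts.values())
    return shares, imbalance_ratio


# Hypothetical labels: 900 positive and 100 negative comments
labels = ["positive"] * 900 + ["negative"] * 100
shares, ratio = class_balance(labels)
print(shares)
print(ratio)
```

A ratio near 1 indicates balanced classes. How large a ratio is tolerable depends on the model and the task, so treat the number as a prompt to inspect the data rather than a fixed threshold.&lt;br /&gt;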
&lt;br /&gt;
=='''4. Intended Use'''==&lt;br /&gt;
&lt;br /&gt;
{|class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ Ideal Dataset by Intended Use&lt;br /&gt;
|-&lt;br /&gt;
! Purpose !! Ideal Dataset Size&lt;br /&gt;
|-&lt;br /&gt;
| '''Small study / early experiment''' || 500–3,000 samples can be enough&lt;br /&gt;
|-&lt;br /&gt;
| '''Journal publication / high accuracy''' || &amp;gt;10,000 samples recommended&lt;br /&gt;
|-&lt;br /&gt;
| '''Production / real-world application''' || The larger, the better&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=='''Practical Example:'''==&lt;br /&gt;
&lt;br /&gt;
If you are analyzing '''TikTok comments''', for example:&lt;br /&gt;
* '''Early experiment''': 1,000–3,000 comments are usable&lt;br /&gt;
* '''ML training (SVM/Random Forest)''': 3,000–10,000 comments is ideal&lt;br /&gt;
* '''Deep learning (LSTM/BERT)''': 10,000+ comments will be far more stable and accurate&lt;/div&gt;</summary>
		<author><name>Onnowpurbo</name></author>
	</entry>
</feed>