<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://www.jayaprakash.net/feed.xml" rel="self" type="application/atom+xml" /><link href="https://www.jayaprakash.net/" rel="alternate" type="text/html" /><updated>2026-05-09T23:22:06+00:00</updated><id>https://www.jayaprakash.net/feed.xml</id><title type="html">Jayaprakash Sundararaj</title><subtitle></subtitle><author><name>Jayaprakash Sundararaj</name><email>osjp463@gmail.com</email></author><entry><title type="html">2026-02-16 tech news</title><link href="https://www.jayaprakash.net/2026-02-16-this-week-tech-news.html" rel="alternate" type="text/html" title="2026-02-16 tech news" /><published>2026-02-16T00:00:00+00:00</published><updated>2026-02-16T00:00:00+00:00</updated><id>https://www.jayaprakash.net/this-week-tech-news</id><content type="html" xml:base="https://www.jayaprakash.net/2026-02-16-this-week-tech-news.html"><![CDATA[<ul>
  <li>Alphabet (Google) sells 100-year bonds worth $20 billion to fund AI spending <a href="https://www.reuters.com/business/alphabet-sells-bonds-worth-20-billion-fund-ai-spending-2026-02-10">link</a></li>
  <li>Novo Nordisk sues HIMS over copying its weight-loss pill; HIMS stock falls 20%.</li>
  <li>OpenClaw has 200K+ GitHub stars and was trending heavily on X and in the news.
    <ul>
      <li>Recently, OpenAI announced that the OpenClaw founder is joining OpenAI</li>
      <li><a href="https://en.wikipedia.org/wiki/OpenClaw">wiki</a> <a href="https://techcrunch.com/2026/02/15/openclaw-creator-peter-steinberger-joins-openai/">news</a></li>
      <li>Also, there are numerous variations of OpenClaw popping up!</li>
    </ul>
  </li>
  <li>Trump claims the Dow will reach 100K points within 3 years (by the end of his term); the Dow recently closed around 50K.</li>
  <li>OpenAI is rolling out ads in ChatGPT for the free tier for now; the code red is paused.</li>
  <li>Companies expected to be added to the S&amp;P 500: VRT, SOFI</li>
  <li>4% of code submitted to GitHub is written by Claude Code.</li>
  <li>The Claude Code team produces a C compiler for under 20K in token usage cost <a href="https://www.anthropic.com/engineering/building-c-compiler">link</a>
<!-- Awesome: https://github.com/sindresorhus/awesome --></li>
</ul>]]></content><author><name>Jayaprakash Sundararaj</name><email>osjp463@gmail.com</email></author><category term="misc" /><summary type="html"><![CDATA[Alphabet (Google) sells 100-year bonds worth $20 billion to fund AI spending link Novo Nordisk sues HIMS over copying its weight-loss pill; HIMS stock falls 20%. OpenClaw has 200K+ GitHub stars and was trending heavily on X and in the news. Recently, OpenAI announced that the OpenClaw founder is joining OpenAI wiki news Also, there are numerous variations of OpenClaw popping up! Trump claims the Dow will reach 100K points within 3 years (by the end of his term); the Dow recently closed around 50K. OpenAI is rolling out ads in ChatGPT for the free tier for now; the code red is paused. Companies expected to be added to the S&amp;P 500: VRT, SOFI 4% of code submitted to GitHub is written by Claude Code. The Claude Code team produces a C compiler for under 20K in token usage cost link]]></summary></entry><entry><title type="html">ML Tokenization</title><link href="https://www.jayaprakash.net/ml-tokenization.html" rel="alternate" type="text/html" title="ML Tokenization" /><published>2021-01-16T00:00:00+00:00</published><updated>2021-01-16T00:00:00+00:00</updated><id>https://www.jayaprakash.net/ml-tokenization</id><content type="html" xml:base="https://www.jayaprakash.net/ml-tokenization.html"><![CDATA[<ul>
  <li>Questions
    <ul>
      <li><mark> Write about KV optimizations </mark></li>
      <li><mark> Can we add a new token and learn it effortlessly? </mark></li>
      <li><mark> Can we build token-less ML model? </mark></li>
      <li>Does tokenization affect multilingual NLP performance (both quality and compute)?</li>
      <li>Why might byte-level tokenization be more robust across languages and domains?</li>
      <li>How do special tokens (e.g., <code class="language-plaintext highlighter-rouge">[CLS]</code>, <code class="language-plaintext highlighter-rouge">&lt;s&gt;</code>, <code class="language-plaintext highlighter-rouge">&lt;/s&gt;</code>, <code class="language-plaintext highlighter-rouge">[PAD]</code>) influence model training and attention behavior?</li>
      <li>How does tokenization differ for text, code, and speech data, and why?</li>
      <li>Why do some tokenizers prefer right padding while others use left padding?</li>
      <li>What happens if you fine-tune a model with a tokenizer different from the one used during pretraining?</li>
      <li>How do tokenizers for code-generation models differ from NLP/text-generation tokenizers?</li>
      <li>Why do we need discrete tokenization instead of continuous, character-based, or byte-embedding approaches?</li>
      <li>Is it possible to have self-tokenizers or adaptive tokenizers in model architecture?</li>
    </ul>
  </li>
</ul>

<p>Tokenization breaks down text (or other data) into smaller units called tokens (represented as integers) before passing them into a machine learning model. Since machines don’t understand text directly but can work with numbers, the embedding layer converts these token IDs into dense vectors (embeddings) that capture semantic meaning. This foundational preprocessing step is crucial for all modern language models.</p>
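
<p>As a minimal sketch of this pipeline (assuming the Hugging Face <code class="language-plaintext highlighter-rouge">transformers</code> library and PyTorch are installed; the checkpoint is only an example), text goes in, integer token IDs come out, and the model’s embedding layer turns those IDs into dense vectors:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from transformers import AutoModel, AutoTokenizer

# Illustrative checkpoint; any encoder or decoder model works the same way.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# Text to integer token IDs.
ids = tok("Tokenization is cool", return_tensors="pt")["input_ids"]
print(ids)

# The embedding layer maps each ID to a dense vector.
embeddings = model.get_input_embeddings()(ids)
print(embeddings.shape)  # (batch, sequence_length, hidden_size)
</code></pre></div></div>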

<p>Tokens are not necessarily complete words, though early systems used word-level tokenization. Modern approaches use subwords (e.g., “play” + “ing”), individual characters (e.g., “p” + “l” + “a” + “y”), or even bytes for multilingual handling and processing non-text data. LLMs like GPT, Claude, and Gemini typically use BPE (Byte Pair Encoding) or SentencePiece tokenizers, which are flexible and eliminate out-of-vocabulary (OOV) issues. The most common tokenization algorithms include BPE, WordPiece, and SentencePiece, with vocabulary sizes typically ranging from 30,000 to 100,000 unique token IDs.</p>
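
<p>To make the BPE idea concrete, here is a toy sketch of its training loop on a tiny corpus: count adjacent symbol pairs and merge the most frequent one, repeatedly. This is only an illustration; real tokenizers operate on bytes, handle pre-tokenization, and keep merging until a target vocabulary size is reached.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from collections import Counter

# Tiny corpus, split into characters plus an end-of-word marker ("_").
corpus = ["play", "playing", "played", "player", "playing"]
words = [list(w) + ["_"] for w in corpus]

def most_frequent_pair(words):
    pairs = Counter()
    for w in words:
        pairs.update(zip(w, w[1:]))
    return pairs.most_common(1)[0][0]

def merge_pair(word, pair):
    a, b = pair
    out = []
    for sym in word:
        if out and out[-1] == a and sym == b:
            out[-1] = a + b  # merge the pair into a single new symbol
        else:
            out.append(sym)
    return out

for step in range(5):
    pair = most_frequent_pair(words)
    words = [merge_pair(w, pair) for w in words]
    print("merge", step, pair, words[1])  # watch "playing" collapse into subwords
</code></pre></div></div>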

<p>Vocabulary size presents important tradeoffs. If the vocabulary is too small, words are split into many small pieces, producing longer token sequences. This means the model must process longer inputs for the same amount of text, requiring more compute. Semantic understanding also suffers: imagine a word broken into individual characters, where each character may not carry useful information on its own. Conversely, if the vocabulary is too large, it includes many rarely used tokens and near-duplicate variants. For example, you might have separate tokens for every number, date, phone number, or address mentioned in the training data. Similar words might also have different token IDs (tokenize, tokenization, tokenizing, tokenized), leading to poor learning and generalization.</p>
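
<p>One way to see the sequence-length side of this tradeoff is to tokenize the same text with tokenizers of different vocabulary sizes and compare token counts (a rough check; the checkpoints below are only examples):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from transformers import AutoTokenizer

text = "Tokenization, tokenizer, tokenizing, tokenized."

# Smaller vocabularies tend to split the same text into more tokens.
for name in ["bert-base-uncased", "Qwen/Qwen1.5-1.8B-Chat"]:
    tok = AutoTokenizer.from_pretrained(name)
    ids = tok.encode(text, add_special_tokens=False)
    print(name, "| vocab size:", tok.vocab_size, "| token count:", len(ids))
    print("  ", tok.convert_ids_to_tokens(ids))
</code></pre></div></div>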

<p>Modern tokenizers include various special tokens that serve specific purposes: <code class="language-plaintext highlighter-rouge">[PAD]</code> for padding, <code class="language-plaintext highlighter-rouge">[UNK]</code> for unknown or out-of-vocabulary words, <code class="language-plaintext highlighter-rouge">[CLS]</code> for start-of-sequence (classification token), <code class="language-plaintext highlighter-rouge">[SEP]</code> for separating sentences, and <code class="language-plaintext highlighter-rouge">[MASK]</code> for masked tokens in MLM. GPT-style models use <code class="language-plaintext highlighter-rouge">&lt;bos&gt;</code> and <code class="language-plaintext highlighter-rouge">&lt;eos&gt;</code> for beginning and end of sequence. Chat models add system/user/assistant role tokens, while coding agents include special tokens for reserved words in programming languages, including tabs for Python.</p>
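
<p>Related to the question above about adding new tokens: with Hugging Face tokenizers a special token can be registered and the embedding matrix resized, but the new embedding row starts out untrained and still has to be learned during fine-tuning. A minimal sketch (the <code class="language-plaintext highlighter-rouge">[THOUGHT]</code> token is a made-up example):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Register a hypothetical special token and grow the embedding matrix to match.
num_added = tok.add_special_tokens({"additional_special_tokens": ["[THOUGHT]"]})
model.resize_token_embeddings(len(tok))  # the new row is randomly initialized

print("tokens added:", num_added, "| new vocab size:", len(tok))
print(tok.encode("[THOUGHT] hello"))  # the special token maps to a single new ID
</code></pre></div></div>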

<p>Padding and truncation strategies vary depending on the model architecture. Right padding is the most common and the default in Hugging Face, used especially in classification, seq2seq, and BERT/T5 models. Left padding is useful for batched autoregressive generation with decoder-only models like GPT, code models, Qwen, and LLaMA: it keeps the real tokens contiguous at the right edge of the batch, which aligns generation positions and plays well with KV-cache optimizations. In autoregressive generation, the model only “looks left,” making the most recent tokens (at the end) the most relevant for predicting the next word.</p>
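
<p>A quick way to see the difference is to batch-encode the same texts with both padding sides (GPT-2 is used here only as an example; it has no pad token by default, so the EOS token is reused):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from transformers import AutoTokenizer

batch = ["Hello", "Tokenization is cool"]

for side in ["right", "left"]:
    tok = AutoTokenizer.from_pretrained("gpt2", padding_side=side)
    tok.pad_token = tok.eos_token  # GPT-2 defines no pad token, so reuse EOS
    enc = tok(batch, padding=True)
    # With left padding the real tokens end at the same position in every row,
    # which is what batched autoregressive generation and the KV cache expect.
    print(side, enc["input_ids"])
</code></pre></div></div>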

<ul>
  <li>Papers
    <ul>
      <li><a href="https://arxiv.org/abs/1508.07909">Neural Machine Translation of Rare Words with Subword Units</a></li>
      <li><a href="https://arxiv.org/abs/1808.06226">SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing</a></li>
      <li><a href="https://aclanthology.org/2021.emnlp-main.160">Fast WordPiece Tokenization</a></li>
      <li><a href="https://aclanthology.org/2024.emnlp-main.40">Tokenization Is More Than Compression</a></li>
      <li><a href="https://en.wikipedia.org/wiki/Byte-pair_encoding">Wikipedia about BPE</a></li>
      <li><a href="https://arxiv.org/abs/2004.03720">Byte Pair Encoding is Suboptimal for Language Model Pretraining</a></li>
      <li><a href="https://www.kaggle.com/code/william2020/how-openai-s-byte-pair-encoding-bpe-works">Kaggle Intro</a></li>
    </ul>
  </li>
  <li>BPE
    <ul>
      <li>todo</li>
    </ul>
  </li>
  <li>Example: inspecting tokenizers and their special tokens (outputs for several models below)</li>
</ul>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">transformers</span> <span class="kn">import</span> <span class="n">AutoTokenizer</span>

<span class="c1">#   "codellama/CodeLlama-7b-hf"
#   "bigcode/starcoder2-3b"
#   "deepseek-ai/deepseek-coder-6.7b-base"
</span><span class="n">model_name</span> <span class="o">=</span> <span class="s">"bert-base-uncased"</span>
<span class="c1"># model_name = "deepseek-ai/deepseek-coder-6.7b-base"
</span><span class="n">tok</span> <span class="o">=</span> <span class="n">AutoTokenizer</span><span class="p">.</span><span class="n">from_pretrained</span><span class="p">(</span><span class="n">model_name</span><span class="p">)</span>

<span class="k">print</span><span class="p">(</span><span class="n">tok</span><span class="p">)</span>

<span class="k">print</span><span class="p">(</span><span class="s">"All special tokens:"</span><span class="p">,</span> <span class="n">tok</span><span class="p">.</span><span class="n">all_special_tokens</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"All special IDs:"</span><span class="p">,</span> <span class="n">tok</span><span class="p">.</span><span class="n">all_special_ids</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Special token map:"</span><span class="p">,</span> <span class="n">tok</span><span class="p">.</span><span class="n">special_tokens_map</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Base vocab size:"</span><span class="p">,</span> <span class="n">tok</span><span class="p">.</span><span class="n">vocab_size</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Total tokens (including added):"</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">tok</span><span class="p">))</span>

<span class="n">text</span> <span class="o">=</span> <span class="s">"Tokenization is Cool! 😎"</span>
<span class="c1"># text = """
#     import numpy as np;
#     import pandas as pd;
#     import matplotlib.pyplot as plt;
#     import seaborn as sns;
</span>
<span class="c1">#     a = np.zeros((2,10))
#     b = 1.001
# """
</span>
<span class="n">tokens</span> <span class="o">=</span> <span class="n">tok</span><span class="p">.</span><span class="n">tokenize</span><span class="p">(</span><span class="n">text</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Token texts:"</span><span class="p">,</span> <span class="n">tokens</span><span class="p">)</span>

<span class="n">token_ids</span> <span class="o">=</span> <span class="n">tok</span><span class="p">.</span><span class="n">encode</span><span class="p">(</span><span class="n">text</span><span class="p">,</span> <span class="n">add_special_tokens</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Token ids:"</span><span class="p">,</span> <span class="n">token_ids</span><span class="p">)</span>
<span class="n">tokens</span> <span class="o">=</span> <span class="n">tok</span><span class="p">.</span><span class="n">convert_ids_to_tokens</span><span class="p">(</span><span class="n">token_ids</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"Token texts from ids:"</span><span class="p">,</span> <span class="n">tokens</span><span class="p">)</span>

</code></pre></div></div>

<ul>
  <li>bert-base-uncased</li>
</ul>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>BertTokenizerFast(name_or_path='bert-base-uncased', vocab_size=30522, model_max_length=512, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'unk_token': '[UNK]', 'sep_token': '[SEP]', 'pad_token': '[PAD]', 'cls_token': '[CLS]', 'mask_token': '[MASK]'}, clean_up_tokenization_spaces=False, added_tokens_decoder={
 0: AddedToken("[PAD]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
 100: AddedToken("[UNK]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
 101: AddedToken("[CLS]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
 102: AddedToken("[SEP]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
 103: AddedToken("[MASK]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
}
)
All special tokens: ['[UNK]', '[SEP]', '[PAD]', '[CLS]', '[MASK]']
All special IDs: [100, 102, 0, 101, 103]
Special token map: {'unk_token': '[UNK]', 'sep_token': '[SEP]', 'pad_token': '[PAD]', 'cls_token': '[CLS]', 'mask_token': '[MASK]'}
Base vocab size: 30522
Total tokens (including added): 30522
Token texts: ['token', '##ization', 'is', 'cool', '!', '[UNK]']
Token ids: [101, 19204, 3989, 2003, 4658, 999, 100, 102]
Token texts from ids: ['[CLS]', 'token', '##ization', 'is', 'cool', '!', '[UNK]', '[SEP]']
</code></pre></div></div>

<ul>
  <li>deepseek-ai/deepseek-coder-6.7b-base</li>
</ul>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>LlamaTokenizerFast(name_or_path='deepseek-ai/deepseek-coder-6.7b-base', vocab_size=32000, model_max_length=16384, is_fast=True, padding_side='left', truncation_side='right', special_tokens={'bos_token': '&lt;｜begin▁of▁sentence｜&gt;', 'eos_token': '&lt;｜end▁of▁sentence｜&gt;', 'pad_token': '&lt;｜end▁of▁sentence｜&gt;'}, clean_up_tokenization_spaces=False, added_tokens_decoder={
 32000: AddedToken("õ", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32001: AddedToken("÷", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32002: AddedToken("Á", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32003: AddedToken("ý", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32004: AddedToken("À", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32005: AddedToken("ÿ", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32006: AddedToken("ø", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32007: AddedToken("ú", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32008: AddedToken("þ", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32009: AddedToken("ü", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32010: AddedToken("ù", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32011: AddedToken("ö", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32012: AddedToken("û", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32013: AddedToken("&lt;｜begin▁of▁sentence｜&gt;", rstrip=False, lstrip=False, single_word=False, normalized=True, special=True),
 32014: AddedToken("&lt;｜end▁of▁sentence｜&gt;", rstrip=False, lstrip=False, single_word=False, normalized=True, special=True),
 32015: AddedToken("&lt;｜fim▁hole｜&gt;", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32016: AddedToken("&lt;｜fim▁begin｜&gt;", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32017: AddedToken("&lt;｜fim▁end｜&gt;", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32018: AddedToken("&lt;pad&gt;", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32019: AddedToken("&lt;|User|&gt;", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32020: AddedToken("&lt;|Assistant|&gt;", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
 32021: AddedToken("&lt;|EOT|&gt;", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False),
}
)
All special tokens: ['&lt;｜begin▁of▁sentence｜&gt;', '&lt;｜end▁of▁sentence｜&gt;']
All special IDs: [32013, 32014]
Special token map: {'bos_token': '&lt;｜begin▁of▁sentence｜&gt;', 'eos_token': '&lt;｜end▁of▁sentence｜&gt;', 'pad_token': '&lt;｜end▁of▁sentence｜&gt;'}
Base vocab size: 32000
Total tokens (including added): 32022
Token texts: ['Ċ', 'ĠĠĠ', 'Ġimport', 'Ġnum', 'py', 'Ġas', 'Ġnp', ';', 'Ġ', 'Ċ', 'ĠĠĠ', 'Ġimport', 'Ġpand', 'as', 'Ġas', 'Ġp', 'd', ';', 'Ġ', 'Ċ', 'ĠĠĠ', 'Ġimport', 'Ġmat', 'plot', 'lib', '.', 'py', 'plot', 'Ġas', 'Ġpl', 't', ';', 'Ġ', 'Ċ', 'ĠĠĠ', 'Ġimport', 'Ġse', 'ab', 'orn', 'Ġas', 'Ġs', 'ns', ';', 'Ċ', 'Ċ', 'ĠĠĠ', 'Ġa', 'Ġ=', 'Ġnp', '.', 'zer', 'os', '((', '2', ',', '1', '0', '))', 'Ċ', 'ĠĠĠ', 'Ġb', 'Ġ=Ġ', '1', '.', '0', '0', '1', 'Ċ']
Token ids: [32013, 185, 315, 1659, 1181, 4016, 372, 21807, 26, 207, 185, 315, 1659, 21866, 281, 372, 265, 67, 26, 207, 185, 315, 1659, 1575, 13371, 2875, 13, 4016, 13371, 372, 568, 83, 26, 207, 185, 315, 1659, 386, 356, 1745, 372, 252, 3585, 26, 185, 185, 315, 245, 405, 21807, 13, 9888, 378, 5930, 17, 11, 16, 15, 1435, 185, 315, 270, 1412, 16, 13, 15, 15, 16, 185]
Token texts from ids: ['&lt;｜begin▁of▁sentence｜&gt;', 'Ċ', 'ĠĠĠ', 'Ġimport', 'Ġnum', 'py', 'Ġas', 'Ġnp', ';', 'Ġ', 'Ċ', 'ĠĠĠ', 'Ġimport', 'Ġpand', 'as', 'Ġas', 'Ġp', 'd', ';', 'Ġ', 'Ċ', 'ĠĠĠ', 'Ġimport', 'Ġmat', 'plot', 'lib', '.', 'py', 'plot', 'Ġas', 'Ġpl', 't', ';', 'Ġ', 'Ċ', 'ĠĠĠ', 'Ġimport', 'Ġse', 'ab', 'orn', 'Ġas', 'Ġs', 'ns', ';', 'Ċ', 'Ċ', 'ĠĠĠ', 'Ġa', 'Ġ=', 'Ġnp', '.', 'zer', 'os', '((', '2', ',', '1', '0', '))', 'Ċ', 'ĠĠĠ', 'Ġb', 'Ġ=Ġ', '1', '.', '0', '0', '1', 'Ċ']
</code></pre></div></div>

<ul>
  <li>Qwen/Qwen1.5-1.8B-Chat</li>
</ul>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Qwen2TokenizerFast(name_or_path='Qwen/Qwen1.5-1.8B-Chat', vocab_size=151643, model_max_length=32768, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '&lt;|im_end|&gt;', 'pad_token': '&lt;|endoftext|&gt;', 'additional_special_tokens': ['&lt;|im_start|&gt;', '&lt;|im_end|&gt;']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
 151643: AddedToken("&lt;|endoftext|&gt;", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
 151644: AddedToken("&lt;|im_start|&gt;", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
 151645: AddedToken("&lt;|im_end|&gt;", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
}
)
All special tokens: ['&lt;|im_end|&gt;', '&lt;|endoftext|&gt;', '&lt;|im_start|&gt;']
All special IDs: [151645, 151643, 151644]
Special token map: {'eos_token': '&lt;|im_end|&gt;', 'pad_token': '&lt;|endoftext|&gt;', 'additional_special_tokens': ['&lt;|im_start|&gt;', '&lt;|im_end|&gt;']}
Base vocab size: 151643
Total tokens (including added): 151646
Token texts: ['Token', 'ization', 'Ġis', 'ĠCool', '!', 'ĠðŁĺ', 'İ']
Token ids: [3323, 2022, 374, 23931, 0, 26525, 236]
Token texts from ids: ['Token', 'ization', 'Ġis', 'ĠCool', '!', 'ĠðŁĺ', 'İ']
</code></pre></div></div>]]></content><author><name>Jayaprakash Sundararaj</name><email>osjp463@gmail.com</email></author><category term="machine-learning" /><category term="llm" /><summary type="html"><![CDATA[Questions Write about KV optimizations Can we add a new token and learn it effortlessly? Can we build token-less ML model? Does tokenization affect multilingual NLP performance (performance and compute)? Why might byte-level tokenization be more robust across languages and domains? How do special tokens (e.g., [CLS], &lt;s&gt;, &lt;/s&gt;, [PAD]) influence model training and attention behavior? How does tokenization differ for text, code, and speech data, and why? Why do some tokenizers prefer right padding while others use left padding? What happens if you fine-tune a model with a tokenizer different from the one used during pretraining? How do tokenizers for code-generation models differ from nlp/text generation tokenizers? Why do we need discrete tokenization - continuous, character-based, or byte-embedding approaches? Is it possible to have self-tokenizers or adaptive tokenizers in model architecture?]]></summary></entry><entry><title type="html">Notes on finance and investments</title><link href="https://www.jayaprakash.net/notes-on-finance-or-investment.html" rel="alternate" type="text/html" title="Notes on finance and investments" /><published>2017-07-31T00:00:00+00:00</published><updated>2017-07-31T00:00:00+00:00</updated><id>https://www.jayaprakash.net/notes-finance-or-investment</id><content type="html" xml:base="https://www.jayaprakash.net/notes-on-finance-or-investment.html"><![CDATA[<ul>
  <li>Links
    <ul>
      <li><a href="https://www.bogleheads.org/wiki/Bogleheads%C2%AE_investment_philosophy">bogleheads forum</a></li>
      <li><a href="https://stockanalysis.com/etf/compare/qqq-vs-vgt-vs-tqqq-vs-voo-vs-upro/">etf comparision</a></li>
      <li><a href="https://www.mrmoneymustache.com/2012/05/29/how-much-do-i-need-for-retirement/">4% rule</a></li>
    </ul>
  </li>
  <li>Advice / Investing principles
    <ul>
      <li>Live below your means - do not allow lifestyle inflation to outpace your income</li>
      <li>Never bear too much or too little risk - balance ambition with security</li>
      <li>Invest early and often - Compound interest is the eighth wonder of the world</li>
      <li>Never try to time the market - stay invested and let time work for you</li>
      <li>Stay the course - commit to your plan through market highs and lows</li>
      <li>Be greedy when others are fearful and fearful when others are greedy - Warren Buffett</li>
      <li>Simplify your life
        <ul>
          <li>“Any darn fool can make something complex. It takes a genius to make something simple.” – Pete Seeger</li>
          <li>declutter</li>
          <li>embrace minimalism</li>
        </ul>
      </li>
      <li>Avoid reading the news, especially clickbait and fear-inducing stories</li>
      <li>DCA - Dollar cost averaging - invest steadily over time, regardless of market conditions</li>
      <li>Invest in your health</li>
    </ul>
  </li>
  <li>Order of investments should approximately follow:
    <ol>
      <li>401k match</li>
      <li>Health savings account (if HDHP is used), LPFSA</li>
      <li>401k pretax</li>
      <li>401K aftertax</li>
      <li>Roth IRA</li>
      <li>Medium interest debt</li>
      <li>529</li>
      <li>Taxable investment</li>
      <li>Low interest debt</li>
    </ol>
  </li>
  <li>Three-fund portfolio
    <ul>
      <li>VTI, BND, VXUS</li>
    </ul>
  </li>
  <li>
    <p>Do not invest in low-volume stocks/ETFs, since they have wider bid-ask spreads and higher trading costs.</p>
  </li>
  <li>Choose low-expense funds.</li>
</ul>]]></content><author><name>Jayaprakash Sundararaj</name><email>osjp463@gmail.com</email></author><category term="finance" /><summary type="html"><![CDATA[Links bogleheads forum etf comparision 4% rule]]></summary></entry><entry><title type="html">Hello World</title><link href="https://www.jayaprakash.net/hello-world-101.html" rel="alternate" type="text/html" title="Hello World" /><published>2017-07-01T00:00:00+00:00</published><updated>2017-07-01T00:00:00+00:00</updated><id>https://www.jayaprakash.net/hello-world</id><content type="html" xml:base="https://www.jayaprakash.net/hello-world-101.html"><![CDATA[<p>Test, Test, Test.</p>]]></content><author><name>Jayaprakash Sundararaj</name><email>osjp463@gmail.com</email></author><category term="misc" /><summary type="html"><![CDATA[Test, Test, Test.]]></summary></entry></feed>