
News

BLEUBERI: BLEU is a surprisingly effective reward for instruction following – cs.AI updates on arXiv.org

arXiv:2505.11080v3 Announce Type: replace-cross
Abstract: Reward models are central to aligning LLMs with human preferences, but they are costly to train, requiring large-scale human-labeled preference data and powerful pretrained LLM backbones. Meanwhile, the increasing availability of high-quality synthetic instruction-following datasets raises the question: can simpler, reference-based metrics serve as viable alternatives to reward models during RL-based alignment? In this paper, we show first that BLEU, a basic string-matching metric, surprisingly matches strong reward models in agreement with human preferences on general instruction-following datasets. Based on this insight, we develop BLEUBERI, a method that first identifies challenging instructions and then applies Group Relative Policy Optimization (GRPO) using BLEU directly as the reward function. We demonstrate that BLEUBERI-trained models are competitive with models trained via reward model-guided RL across four challenging instruction-following benchmarks and three different base language models. A human evaluation further supports that the quality of BLEUBERI model outputs is on par with those from reward model-aligned models. Moreover, BLEUBERI models generate outputs that are more factually grounded than competing methods. Overall, we show that given access to high-quality reference outputs (easily obtained via existing instruction-following datasets or synthetic data generation), string matching-based metrics are cheap yet effective proxies for reward models during alignment. We release our code and data at https://github.com/lilakk/BLEUBERI.

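The recipe is simple enough to sketch: score each sampled completion with BLEU against a reference answer, then feed the group-normalized scores into GRPO as advantages. The snippet below is an illustrative outline of that idea only, not the released BLEUBERI code; the sacrebleu call and the toy rollouts are assumptions for demonstration.

```python
# Illustrative sketch of BLEU-as-reward for GRPO (not the official BLEUBERI code).
import sacrebleu

def bleu_reward(candidate: str, reference: str) -> float:
    """Sentence-level BLEU of a sampled completion against a reference, scaled to [0, 1]."""
    return sacrebleu.sentence_bleu(candidate, [reference]).score / 100.0

def grpo_advantages(rewards: list[float]) -> list[float]:
    """GRPO-style advantage: normalize each reward against its group of rollouts."""
    mean = sum(rewards) / len(rewards)
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5 or 1.0
    return [(r - mean) / std for r in rewards]

# Toy example: one prompt, one reference, a group of sampled rollouts.
reference = "Paris is the capital of France."
rollouts = ["Paris is France's capital.", "The capital of France is Berlin.", "Paris."]
rewards = [bleu_reward(c, reference) for c in rollouts]
print(grpo_advantages(rewards))
```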

News

Action Quality Assessment via Hierarchical Pose-guided Multi-stage Contrastive Regression – cs.AI updates on arXiv.org

arXiv:2501.03674v2 Announce Type: replace-cross
Abstract: Action Quality Assessment (AQA), which aims at automatic and fair evaluation of athletic performance, has gained increasing attention in recent years. However, athletes are often in rapid movement and the corresponding visual appearance variances are subtle, making it challenging to capture fine-grained pose differences and leading to poor estimation performance. Furthermore, most common AQA tasks, such as diving in sports, are usually divided into multiple sub-actions, each of which contains different durations. However, existing methods focus on segmenting the video into fixed frames, which disrupts the temporal continuity of sub-actions resulting in unavoidable prediction errors. To address these challenges, we propose a novel action quality assessment method through hierarchically pose-guided multi-stage contrastive regression. Firstly, we introduce a multi-scale dynamic visual-skeleton encoder to capture fine-grained spatio-temporal visual and skeletal features. Then, a procedure segmentation network is introduced to separate different sub-actions and obtain segmented features. Afterwards, the segmented visual and skeletal features are both fed into a multi-modal fusion module as physics structural priors, to guide the model in learning refined activity similarities and variances. Finally, a multi-stage contrastive learning regression approach is employed to learn discriminative representations and output prediction results. In addition, we introduce a newly-annotated FineDiving-Pose Dataset to improve the current low-quality human pose labels. In experiments, the results on FineDiving and MTL-AQA datasets demonstrate the effectiveness and superiority of our proposed approach. Our source code and dataset are available at https://github.com/Lumos0507/HP-MCoRe.

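In general form, a contrastive-regression stage scores a query performance relative to an exemplar whose ground-truth score is known. The sketch below illustrates just that regression head with hypothetical pooled clip features; it is not the HP-MCoRe implementation and omits the pose encoder, procedure segmentation, and fusion modules described in the abstract.

```python
# Minimal sketch of contrastive regression for AQA (illustrative only; not HP-MCoRe).
# A query clip is scored relative to an exemplar clip with a known score:
# the model regresses the score difference and adds back the exemplar's score.
import torch
import torch.nn as nn

class ContrastiveRegressor(nn.Module):
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim, 128), nn.ReLU(), nn.Linear(128, 1)
        )

    def forward(self, query_feat, exemplar_feat, exemplar_score):
        # Predict the relative score difference between query and exemplar.
        delta = self.head(torch.cat([query_feat, exemplar_feat], dim=-1))
        return exemplar_score + delta.squeeze(-1)

# Example with hypothetical pooled clip features for a batch of 4 videos.
model = ContrastiveRegressor()
q, e = torch.randn(4, 256), torch.randn(4, 256)
scores = model(q, e, exemplar_score=torch.tensor([82.5, 70.0, 91.0, 65.5]))
print(scores.shape)  # torch.Size([4])
```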

News

Building a Geospatial Lakehouse with Open Source and Databricks Towards Data Science

An example workflow for vector geospatial data science.

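As a rough illustration of what a vector workflow of this kind can look like (a minimal sketch of my own, not code from the article): read a vector layer with GeoPandas, reproject it, derive an attribute, and persist it as GeoParquet, a columnar format that lakehouse engines such as Databricks can register as a table. The file paths and columns below are placeholders.

```python
# Hypothetical vector-data workflow sketch (not from the article):
# load a vector layer, reproject, and persist as GeoParquet for a lakehouse table.
import geopandas as gpd

# "parcels.geojson" is a placeholder input path.
gdf = gpd.read_file("parcels.geojson")

# Reproject to a metric CRS so area calculations are meaningful.
gdf = gdf.to_crs(epsg=3857)
gdf["area_m2"] = gdf.geometry.area

# GeoParquet keeps geometry plus columnar attributes, which lakehouse
# engines can then expose as an external table.
gdf.to_parquet("parcels.parquet")
```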

News

Data Visualization Explained (Part 4): A Review of Python Essentials Towards Data Science

Learn the foundations of Python to take your data visualization game to the next level.

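As a hypothetical illustration of the kind of Python fundamentals such a review covers: lists, dictionaries, comprehensions, and a basic Matplotlib chart. The example data and plot below are mine, not the article's.

```python
# Hypothetical example of the Python fundamentals behind most plotting code
# (not taken from the article): core data structures feeding a Matplotlib chart.
import matplotlib.pyplot as plt

# A list of dicts is a common lightweight record format before reaching pandas.
sales = [
    {"month": "Jan", "revenue": 120},
    {"month": "Feb", "revenue": 135},
    {"month": "Mar", "revenue": 150},
]

months = [row["month"] for row in sales]      # list comprehension over records
revenue = [row["revenue"] for row in sales]

fig, ax = plt.subplots()
ax.bar(months, revenue)
ax.set_xlabel("Month")
ax.set_ylabel("Revenue")
plt.show()
```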

News

How to Build, Train, and Compare Multiple Reinforcement Learning Agents in a Custom Trading Environment Using Stable-Baselines3 MarkTechPost

In this tutorial, we explore advanced applications of Stable-Baselines3 in reinforcement learning. We design a fully functional, custom trading environment, integrate multiple algorithms such as PPO and A2C, and develop our own training callbacks for performance tracking. As we progress, we train, evaluate, and visualize agent performance to compare algorithmic efficiency, learning curves, and decision…

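The general pattern such a tutorial follows, a Gymnasium environment wrapped for Stable-Baselines3 and then trained with PPO and A2C, can be sketched as below. The toy environment and reward here are placeholders, not the article's trading logic.

```python
# Rough sketch of the Stable-Baselines3 pattern described in the tutorial
# (placeholder environment; not the article's actual trading logic).
import gymnasium as gym
import numpy as np
from gymnasium import spaces
from stable_baselines3 import A2C, PPO

class ToyTradingEnv(gym.Env):
    """Minimal stand-in for a trading env: observe a feature window, act sell/hold/buy."""

    def __init__(self, episode_len: int = 100):
        super().__init__()
        self.episode_len = episode_len
        self.action_space = spaces.Discrete(3)  # 0=sell, 1=hold, 2=buy
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(5,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.t = 0
        return self.np_random.normal(size=5).astype(np.float32), {}

    def step(self, action):
        self.t += 1
        obs = self.np_random.normal(size=5).astype(np.float32)
        reward = float(action - 1) * float(obs[0])  # toy reward: position * price move
        terminated = self.t >= self.episode_len
        return obs, reward, terminated, False, {}

# Train two algorithms on the same environment to compare them.
for algo in (PPO, A2C):
    model = algo("MlpPolicy", ToyTradingEnv(), verbose=0)
    model.learn(total_timesteps=5_000)
```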

News

AI Agents: From Assistants for Efficiency to Leaders of Tomorrow? Towards Data Science

How artificial intelligence is evolving from “simple” assistants to potential architects of our future, even CEOs and governors.


News
5 Common LLM Parameters Explained with Examples MarkTechPost

Large language models (LLMs) offer several parameters that let you fine-tune their behavior and control how they generate responses. If a model isn’t producing the desired output, the issue often lies in how these parameters are configured. In this tutorial, we’ll explore some of the most commonly used ones — max_completion_tokens, temperature, top_p, presence_penalty, and…

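For orientation, here is a hypothetical call showing where those parameters sit in an OpenAI-style chat completion request; the values and model name are illustrative placeholders, not taken from the article.

```python
# Hypothetical illustration of the parameters the article lists, using the
# OpenAI Python SDK's chat completions call (values chosen for demonstration).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Give me three taglines for a coffee shop."}],
    max_completion_tokens=100,   # upper bound on generated tokens
    temperature=0.7,             # higher values = more random sampling
    top_p=0.9,                   # nucleus sampling: keep the top 90% probability mass
    presence_penalty=0.5,        # discourage repeating topics already mentioned
)
print(response.choices[0].message.content)
```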

News

The Power of Framework Dimensions: What Data Scientists Should Know Towards Data Science

Practical guidance and a case study.


News
A New AI Research from Anthropic and Thinking Machines Lab Stress Tests Model Specs and Reveals Character Differences among Language Models MarkTechPost

AI companies use model specifications to define target behaviors during training and evaluation. Do current specs state the intended behaviors with enough precision, and do frontier models exhibit distinct behavioral profiles under the same spec? A team of researchers from Anthropic, Thinking Machines Lab and Constellation present a systematic method that stress tests model specs…


News

Google vs OpenAI vs Anthropic: The Agentic AI Arms Race Breakdown MarkTechPost

In this article we analyze how Google, OpenAI, and Anthropic are productizing ‘agentic’ capabilities across computer-use control, tool/function calling, orchestration, governance, and enterprise packaging. Agent platforms, not only models, now define competitive advantage. Google is aligning Gemini 2.0 with an enterprise control plane on Vertex AI and a new ‘front door’ called Gemini Enterprise.
